APPLICATIONS AND PROBLEMS OF 
PRODUCTIVITY DATA 


Charles E. Young 
Westinghouse Electric Corporation 


The gain in output per man-hour in manufacturing since 
1899 appears to have been related fairly closely to increased 
use of power per worker. This raises the question whether the 
common assumption of a persistent rate of gain in productivity 
can be justified apart from corresponding increases in manu- 
facturing investment per employed worker. 

Manufacturing output has corresponded roughly to the 
product of an installed-horsepower index and the length of the 
work week. A sizable part of peak wartime output apparently 
came from overtime and extra-shift use of facilities, and ap- 
parently sustained high output and productivity in the early 
postwar years will require similarly intensive use of facilities. 

Real hourly wages in manufacturing since 1914 have corre- 
sponded closely to changes in productivity, and finished 
goods prices have corresponded closely to changes in unit labor 
cost (wage rates adjusted for productivity). 


AT LEAST two major developments have brought the question of 
productivity increasingly to the fore in recent years. One has been 
the popular statistical pastime of projecting gross national product 
from the starting assumption of full employment; the other has been 
in connection with disputes as to the justification of wage increases. 
Without due consideration of the level of productivity, it is, of 
course, impossible to establish the quantity of output which is equiva- 
lent to stated assumptions as to the number of workers and their hours 
of work; it is also impossible to determine the quantity of output which 
employers, and hence customers, will receive for the funds expended in 
payrolls. 

It is not my purpose in this paper to establish any innovations in 
technique, but simply to refresh your memories with some of the 
applications and problems of productivity data from the standpoint of 
a practicing business statistician. 

Several aspects of the typical discussions of productivity are worthy 
of attention. First of all, there has been a surprising lack of discrimina- 
tion in the use of terms. An eminent Washington consultant, in a 
recently-published article, used the terms “production per worker” and 
“production per man-hour” as if they were interchangeable; yet it is 
instantly apparent that variations in production per worker are the 
joint result of changes in production per man-hour and of changes in 
hours worked. The variation in the length of the work week over the 
past fifteen or twenty years, or even in the last 6 years or 6 months, 
has been so pronounced that this particular oversight seems inexcus- 
able and scarcely conducive to clear thought or dependable conclusions. 

Perhaps an even more outstanding characteristic of the discussions 
of productivity has been the widespread acceptance of a uniform rate of 
gain in productivity somewhere in the neighborhood of 2 per cent per 
year. A recent tabulation in Dun’s Review showed the estimates of 
eight writers ranging from 2.0 per cent to 2.6 per cent per year increase 
in output per worker, with six of the eight ranging between 2.1 and 2.4 
per cent. These estimates apparently are based primarily on the rather 
scattered and dubious data for the period from 1899 through 1919, 
for they ignore the fact that Dr. Fabricant’s figures on output per 
man-hour in manufacturing from 1919 to 1939 show no evidence what- 
ever of an exponential trend; they appear convincingly linear. Produc- 
tion per worker, as opposed to production per man-hour, should show 
even less of an exponential tendency because of the reduced length of 
the typical work-week over the past twenty years. 
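
To make the contrast between an exponential and an arithmetic trend concrete, the following minimal sketch compares the two over ten, twenty and forty years. The 2 per cent rate is taken from the estimates cited above; the arithmetic alternative of 2 index points per year, the base of 100 and the function names are illustrative assumptions and are not Dr. Fabricant's figures.

```python
def compound_index(years, rate=0.02, base=100.0):
    """Index after `years` of growth at a constant percentage rate."""
    return base * (1.0 + rate) ** years


def arithmetic_index(years, points_per_year=2.0, base=100.0):
    """Index after `years` of growth by a constant number of index points."""
    return base + points_per_year * years


if __name__ == "__main__":
    # Both series gain 2 points in the first year; they drift apart afterwards.
    for years in (10, 20, 40):
        print(years, round(compound_index(years), 1), arithmetic_index(years))
```

Under these assumptions the compound series stands near 221 after forty years against 180 for the arithmetic series, which is why the choice of trend matters for long-range extrapolation.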

This common assumption of a persistent rate of gain in productivity 
opens two questions: first, the acceptability of long-range extrapola- 
tions on an exponential basis of a trend which for the latest twenty 
years has appeared to be arithmetic, and second, whether there is 
really any inherent and independent trend toward increased output 
per man-hour or per worker, or whether this increase may be a re- 
sultant of independent forces, lacking which the gains in productivity 
might cease. It seems to me more reasonable to consider increased 
efficiency as a result than as an inherent attribute of workers. 

In this connection we might note the implicit assumption in much 
of the discussion of productivity that production per man-hour is 
primarily a function of the mental and emotional attitude of the 
worker toward his job. 

This belief that productivity is an attribute of the individual prob- 
ably stems from two sources. First, it is a commonplace of factory ex- 
perience that employee “slowdowns” can seriously hinder production, 
while high morale and general enthusiasm can appreciably increase it. 
These facts, of course, apply particularly to short intervals of time 
rather than to periods several years apart; they represent fluctuations 
about a general trend, not the trend itself. 

Second, in activities where skill and craftsmanship play an important 
part, we have all observed the vast differences between experts and 
beginners. A skilled painter can paint my whole house about as fast 
as I can do one room—and do a better job of it, too. A skilled carpenter 
can make my best efforts look clumsy and futile. A good assistant can 
cut my work in half, while a poor one only adds to my woes. But note 
that all of these pursuits are non-mechanical. A good painter with a 
spray-gun can far outperform a good painter with a brush; a good 
carpenter with power tools can outperform a good carpenter with 
hand tools, and a good tabulating machine operator can far outdo the 
most diligent longhand tabulator. The more skill and effort that can 
be transferred from the worker to his tools, the less imposing does in- 
dividual artistry and attitude become. In this age of mechanization the 
trend has been increasingly away from the skill and volition of the 
individual worker toward the speed, power and consistent performance 
of the machine. This trend has been especially pronounced in manu- 
facturing. 

In exploring the possible explanations of increased production per 
man-hour in manufacturing, we find numerous possibilities. On the side 
of mechanization are the application of increased power per worker 
and improvements in machines—greater speed, precision and spe- 
cialization, with expanded use of jigs and fixtures to transfer the 
requisite skill and consistency of performance as far as possible from 
the worker to the machine. Under the heading of industrial techniques 
are improved designs, simplification and standardization of products, 
refinements of materials, development of new processes (for example, 
extrusion, die-casting, powder metallurgy, electronic heat-treating, 
etc.), better plant layout and flow of work, improved material-handling 
equipment, job analysis, time and motion study, quality control and 
many others. Affecting the performance of the individual employee are 
improved lighting and ventilation, higher standards of health and 
safety, job training for the development of skills, sundry security and 
recreation programs to improve morale and incentive programs to 
induce maximum application and effort. 

Out of this list it is immediately apparent that the skill and applica- 
tion to duty of individual workers must play a relatively small part in 
the total changes in manufacturing output per man-hour over an 
extended period. In fact, while Dr. Fabricant’s figures indicate that 
manufacturing output per man-hour more than doubled between 1919 
and 1939, it is difficult to believe that the average skill and application 
of several million manufacturing workers could have undergone any 
such proportionate change in those two decades. 

[FIGURE 1. RELATION OF PRODUCTIVITY TO HORSE POWER IN MANUFACTURING, 1899 = 100] 

It does appear, on the other hand, that the much greater use of 
power per wage earner in manufacturing accounts for a very large 
proportion of the increased output per man-hour. Taking 1899 as 100 
in each case, Fabricant’s figure for output per man-hour in manu- 
facturing was 309 in 1939, while Census data on installed horsepower 
and wage earners in manufacturing showed horsepower per wage 
earner at 292. Using only the terminal years of this forty-year period, 
installed horsepower per wage earner in manufacturing increased 
95 per cent as much as did output per man-hour. 
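
The comparison of the two terminal-year indexes can be checked with a short calculation. This is a sketch only: the two index values come from the paragraph above, while the function names and the second measure (ratio of gains over the 1899 base) are assumptions added for comparison. The 95 per cent quoted in the text appears to correspond to the ratio of the index levels.

```python
# Index values for 1939 quoted in the text (1899 = 100).
OUTPUT_PER_MAN_HOUR_1939 = 309.0
HORSEPOWER_PER_WAGE_EARNER_1939 = 292.0


def ratio_of_levels(numerator, denominator):
    """Ratio of two index levels."""
    return numerator / denominator


def ratio_of_gains(numerator, denominator, base=100.0):
    """Ratio of the gains of two indexes over a common base (an alternative reading)."""
    return (numerator - base) / (denominator - base)


if __name__ == "__main__":
    hp, out = HORSEPOWER_PER_WAGE_EARNER_1939, OUTPUT_PER_MAN_HOUR_1939
    print(round(100 * ratio_of_levels(hp, out), 1))  # levels, per cent
    print(round(100 * ratio_of_gains(hp, out), 1))   # gains over base, per cent
```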

The figures for these and other years are shown in Figure 1. The 
dashed line indicates a point-for-point relationship; the sharper slope 
of the data since 1919 indicates that other factors than horsepower per 
wage earner have also operated to raise productivity, but the general 
similarity is still striking. For eight different combinations of terminal 
years, the median ratio of the increase in horsepower per wage earner to 
the increase in output per man-hour is 80 per cent. 

One weakness in this approach is illustrated by the fact that the 
data between 1932 and 1943 indicate a nearly-perfect negative cor- 
relation. The answer is that in 1932 the installed horsepower figure 
greatly overstated the horsepower actually in use, while in 1943 a 
large part of the installed horsepower was being used by successive 
shifts of workers. A full verification of the thesis would require data 
(which are not available) on horsepower actually in use per regular- 
shift wage earner. However, the consistency of the relationship in 
years which were marked by neither sharp depression nor general 
multiple-shift operation is indicative of at least a large degree of co- 
variation. 

A further assumption underlying this relationship is, of course, the 
intelligent application of power. A five-horsepower motor in the middle 
of the president’s office would have little value, nor would the applica- 
tion of a 100-horsepower motor to a two-horsepower job. In general, 
however, it appears that horsepower in manufacturing has been in- 
telligently applied, and that this intelligent application of power has 
contributed very substantially to the increased output per man-hour 
in manufacturing. 

As a partial verification of this thesis, installed horsepower in manu- 
facturing times the hours it is used should provide a close approxima- 
tion to manufacturing output. Here again it is necessary to resort to 
makeshift figures, using total installed horsepower instead of horse- 
power actually in use, and using weekly hours per wage earner as a 
substitute for weekly hours per horsepower. (For non-Census years, 
the installed horsepower figures were interpolated via correlation with 
Department of Commerce data on expenditures for new manufactur- 
ing equipment.) The results are shown in Figure 2. 

All the data in Figure 2 are index numbers based on 1899 as 100. 
The solid line is the product of an index of installed horsepower in 
manufacturing and an index of actual weekly hours per wage earner in 
manufacturing. The broken line is an index of manufacturing output, 
as reported by Fabricant, from 1899 to 1939, and by the Federal Re- 
serve Board since 1939. Three areas of divergence merit special atten- 
tion. First, between 1909 and the early 1920’s there is a persistent gap 
of about 20 points between the two lines. Second, in all the periods of 
sharp depression—1921, the early 1930’s and 1938—the index of 
manufacturing output drops away from the product of installed horse- 
power and the work week, indicating that an abnormal portion of the 
horsepower became idle in those depression periods. Third, in all the 
peak periods—1929, 1937 and the 1940’s—the index of manufacturing 
output rises above the product of installed horsepower and the work 
week, indicating that an abnormal portion of the horsepower was used 
on second and third shifts. Taking the figures at face value for the 
moment, they imply that during the peak years of war production one- 
third or more of manufacturing output was produced on extra shifts. 

This conclusion results directly from the thesis that regular-shift 
capacity can be measured by installed horsepower and the hours per 
week it is used. The wartime spread between manufacturing output 
and the product of horsepower times hours worked per week provides 
a rough measure of production on shifts not ordinarily worked. The 
spread amounts to 371 points of the peak output index of 887 (on an 
1899 base) reported by the Federal Reserve Board. This implies that 
41.8 per cent of the peak output was represented either by extra-shift 
work or by wartime distortions of the index. If we accept the view 
expressed in a later paragraph that the wartime peak in the Federal 
Reserve Board’s index of manufacturing output may have been over- 
stated by as much as 15 per cent (a possibility that is difficult either 
to prove or disprove), then the portion of peak output attributable to 
above-normal use of equipment becomes 33 per cent instead of 42 per 
cent. Some portion of even this 33 per cent is undoubtedly attributable 
to abnormally intensive use of equipment on regular shifts, as well as 
to extra shifts. 
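
The arithmetic behind the 42 and 33 per cent figures can be sketched as follows. The peak index of 887 and the spread of 371 points are taken from the text. The function name is hypothetical, and the reading of "overstated by 15 per cent" as a reported index equal to 1.15 times the true index is an assumption, adopted because it reproduces the 33 per cent quoted above.

```python
PEAK_OUTPUT_INDEX = 887.0  # Federal Reserve Board peak, 1899 = 100 (from the text)
SPREAD_POINTS = 371.0      # output index minus horsepower-times-hours product


def above_normal_share(peak, spread, overstatement=0.0):
    """Share of peak output not explained by regular-shift capacity.

    `overstatement` is the assumed fractional overstatement of the
    reported peak index (0.15 means reported = 1.15 * true).
    """
    regular_shift_capacity = peak - spread
    corrected_peak = peak / (1.0 + overstatement)
    return (corrected_peak - regular_shift_capacity) / corrected_peak


if __name__ == "__main__":
    print(round(100 * above_normal_share(PEAK_OUTPUT_INDEX, SPREAD_POINTS), 1))
    print(round(100 * above_normal_share(PEAK_OUTPUT_INDEX, SPREAD_POINTS, 0.15), 1))
```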

Since such a large portion of wartime manufacturing output ap- 
parently resulted from extra shifts, it follows that the plant capacity of 
the country is not nearly adequate to support a level of activity ap- 
proaching the wartime rates on the single eight-hour shift character- 
istic of peacetime production. Considerable extra-shift work will be 
involved if the full-employment levels of activity so generally discussed 
are attained within the next year or two, as I believe they will be. 

Reverting to the relationship between horsepower per wage earner 
and manufacturing output per man-hour, it again appears that exist- 
ing horsepower are not adequate to support even prewar levels of 
productivity under postwar full-employment conditions without resort 
to the double and triple use of horsepower involved in extra shifts. 

There is little question but that a still closer relationship could be 
established if we were able to determine for each period the number of 
horsepower actually in use and the hours of their use—that is, get a 
measure of horsepower-hours. While in this form the project sounds 
imposing, it might be approached through careful interpretation of 
data on kilowatt-hours consumed in manufacturing, which are easily 
convertible into horsepower-hours. 

The major problems involved would be isolating the records for 
power consumed in manufacturing plants and screening out the 
amount of power consumed in lighting and other loads not varying 
directly with productive activity. That power consumption is a prac- 
tical measure of productive activity is illustrated by the fact that 
numerous plant managers keep a close check each morning on the 
power load as a gauge of the total activity in the plant. 
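
A minimal sketch of the conversion suggested above follows. The factor of 745.7 watts per mechanical horsepower is the standard conversion and is not given in the source; the function names and the idea of passing the lighting and other non-productive load as a separate figure are illustrative assumptions.

```python
WATTS_PER_HORSEPOWER = 745.7  # mechanical horsepower; standard factor, not from the source


def kwh_to_horsepower_hours(kwh):
    """Convert kilowatt-hours to horsepower-hours."""
    return kwh * 1000.0 / WATTS_PER_HORSEPOWER


def productive_horsepower_hours(total_kwh, lighting_and_other_kwh):
    """Horsepower-hours after screening out loads that do not vary with output."""
    if lighting_and_other_kwh > total_kwh:
        raise ValueError("non-productive load cannot exceed total consumption")
    return kwh_to_horsepower_hours(total_kwh - lighting_and_other_kwh)


if __name__ == "__main__":
    # Hypothetical plant: 1,000 kWh consumed, 200 kWh of it for lighting.
    print(round(productive_horsepower_hours(1000.0, 200.0), 1))
```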

It may be of interest to note, in passing, that the Federal Reserve 
Board’s index of manufacturing output in the peak war years of 1943 
and 1944 was about 15 per cent higher than a corresponding index of 
kilowatt-hours consumed in manufacturing based on Federal Power 
Commission data; the discrepancy developed after 1941, and was 
considerably greater for the revised figures for 1942 than for the 
figures first reported. 

With both indexes at 100 for 1939, manufacturing output (exclud- 
ing mining) stood at 150 in 1941 and 176 in 1942, while power 
consumed in manufacturing was 147 in 1941 and 173 in 1942. The 
subsequent wartime revision of the Federal Reserve Index, however, 
lifted these figures to 154 in 1941 and 194 in 1942, or considerably 
above the power consumption index. By 1943 the Federal Reserve 
Index showed manufacturing output at 237 per cent of the 1939 
base, or roughly 15 per cent above the power consumption index 
of 207. 
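
The size of the discrepancy can be tabulated directly from the index values quoted above (1939 = 100). The sketch below uses only those values; the dictionary and function names are hypothetical.

```python
# Index values quoted in the text, 1939 = 100.
FRB_FIRST_REPORTED = {1941: 150.0, 1942: 176.0}
FRB_REVISED = {1941: 154.0, 1942: 194.0, 1943: 237.0}
POWER_CONSUMED = {1941: 147.0, 1942: 173.0, 1943: 207.0}


def excess_over_power(output_index, power_index):
    """Percentage by which the output index exceeds the power index."""
    return 100.0 * (output_index / power_index - 1.0)


def excess_by_year(output_series):
    """Excess of an output series over power consumption, by year, in per cent."""
    return {
        year: round(excess_over_power(value, POWER_CONSUMED[year]), 1)
        for year, value in output_series.items()
    }


if __name__ == "__main__":
    print("first reported:", excess_by_year(FRB_FIRST_REPORTED))
    print("revised:       ", excess_by_year(FRB_REVISED))
```

On these figures the first-reported index runs about 2 per cent above power consumption in 1941 and 1942, while the revised index runs about 5, 12 and 14.5 per cent above it in 1941, 1942 and 1943.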

In the search for an index of industrial activity that can be checked 
and verified from various angles, the available information on electric 
power consumed in manufacturing should be brought more prominently 
into the picture. 

The question of productivity is closely related to some of the most 
burning economic issues of the day. A concerted drive is on to increase 
the purchasing power of organized labor, and the accent on real pur- 
chasing power, in terms of goods and services, is evident in the in- 
sistence that wage increases shall not be reflected in price advances. 
Yet from the past record the evidence is clear that advances in real 
hourly wages have been closely tied, for at least three decades, to ad- 
vances in output per man-hour. The data are shown in Figure 3. Here 
again the dashed line indicates a point-for-point relationship. The data 
are index numbers on a 1939 base; the real income data are Bureau of 
Labor Statistics figures on average hourly earnings in all manufacturing 
divided by the BLS cost-of-living index, while the output per man-hour 
figures are the same as were used before, from Dr. Fabricant, extended 
by Federal Reserve Board and BLS data after 1939. 

In general, it is apparent that there must be some fairly close rela- 
tionship between the quantity of goods and services which an average 
hour’s work can buy and the quantity which an average hour’s work 
produces. 

Since the relationship among wages, prices and productivity is so 
close in this combination, it should not be surprising to find it close in 
another. Figure 4 tells the story of the relationship between unit labor 
cost and finished goods prices. The solid line shows the annual average 
of unit labor cost for all manufacturing, while the broken line shows 
the annual averages of the Bureau of Labor Statistics index of finished 
goods prices. The unit labor cost figures were derived by dividing 
average hourly earnings in all manufacturing by the index of output 
per man-hour in all manufacturing. Both lines are on a 1939 base. 
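
The derivation of unit labor cost, and its link to real hourly earnings, can be sketched as below. The index values are hypothetical and chosen only for round arithmetic, and the supposition that prices move exactly in step with unit labor cost is an illustrative assumption rather than a finding of the charts.

```python
def unit_labor_cost_index(hourly_earnings_index, output_per_man_hour_index):
    """Unit labor cost: wage rates adjusted for productivity (base = 100)."""
    return 100.0 * hourly_earnings_index / output_per_man_hour_index


def real_hourly_earnings_index(hourly_earnings_index, price_index):
    """Hourly earnings deflated by a price index (base = 100)."""
    return 100.0 * hourly_earnings_index / price_index


if __name__ == "__main__":
    earnings, productivity = 160.0, 125.0  # hypothetical index values
    ulc = unit_labor_cost_index(earnings, productivity)
    # If prices follow unit labor cost, real earnings follow productivity.
    print(ulc, real_hourly_earnings_index(earnings, ulc))
```
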
These two illustrations point to the futility of attempting to force, 
solely through increases in money incomes, increases in real income 
that are not firmly rooted in high productivity. Such efforts constitute 
a form of warfare between economic groups, since they can only 
succeed by depriving one group to enrich another. Further, taken in 
conjunction with the relationship between power per worker and in- 
creased productivity, they tend to reduce further gains in productivity 
by discouraging further investment in power equipment. It is investment 
in plant and equipment that underlies both rising real incomes 
and the rising productivity whose steady annual increase has been so 
glibly assumed. 

In view of the basic importance of productivity in appraising cur- 
rent and prospective economic trends, it is essential that more adequate 
information—and more general understanding of its significance—be 
developed. On the information side, the problem is primarily one of 
establishing satisfactory units of measurement for production and 
comparing the production measurements with the man-hour figures 
related directly to them. It is obvious that for this purpose man-hours 
are not satisfactory units to measure production; man-hours divided 
by themselves cannot trace changes in productivity. To establish 
production units in heterogeneous industries such as electrical equip- 
ment or chemicals it may be necessary to go into considerable detail 
or resort to sampling from among hundreds or thousands of individual 
products. The measurement of productivity, however, can be no better 
than the measurement of production on which it is based. Let me stress 
again the desirability of greater attention to power use as a check on 
the reasonableness of production data. 

As for widespread understanding of productivity—its origins, its sig- 
nificance and its limitations—any such achievement must perforce be 
gradual. Certainly there is much for professional economists and statis- 
ticians to learn before their findings can be broadly distributed 
throughout management, government and the general public. Many of 
the findings will run directly counter to ideas now popularly accepted, 
but it is a job that must be done if our democracy and our economic 
machine are to run on the fuel of facts—which, I take it, is why most of 
us are here today. 








INSPECTION EFFICIENCY AND SAMPLING 
INSPECTION PLANS 


Marvin Lavin 
Corning Glass Works 


The published sampling inspection plans contain the assumption 
that the inspection operation is completely efficient, 
that is, the items examined are invariably classified correctly. 
Some contributing factors to lack of inspection efficiency are 
noted, and an analysis of the validity of the guarantees of the 
plans in the presence of inspection error is made. 


THE WORK of Dodge and Romig, Wald, Wolfowitz, Bartky, Freeman, 
and other mathematical statisticians in the construction and analysis 
of sampling inspection plans has provided industrial quality control 
personnel with useful methods for attacking quality and inspection 
problems. 

The various available sampling inspection plans provide features 
such as maintenance of chosen risks for wrong decisions on lot quality, 
minimization of inspection or of sampling, and maintenance of a 
selected AOQL, if the hypotheses upon which the inspection procedures 
are based are not severely violated in practice. To obtain the desired 
features of an inspection plan, the user must meet such representative 
conditions as randomness of sampling and the replacing of defective 
units, when found, by non-defectives; he, in some instances, must have 
a priori information as to the process average quality and its state of 
statistical control; he must provide some basis, economic or otherwise, 
for the choosing of tolerable risks of wrong decisions as to lot quality. 

The published form of all sampling inspection plans with which the 
author is familiar requires the assumption that the inspection operation 
is completely efficient, that is, the items examined are invariably classified 
correctly. This requirement, it appears, is a very critical one, particularly 
in industries in which “visual” inspection is of prime importance. 
In the present state of many inspection methods, this condition is far 
from being met, and it is often apparent that the efficiency of the in- 
spector in classifying individual items correctly is sufficiently low to 
nullify the theoretical features of the well-known sampling inspection 
procedures, when applied under the assumption of complete efficiency. 
Sustained efforts should certainly be made to remove the sources of in- 
spection error; however, realistic analysis of the effectiveness of 
sampling inspection requires in many cases the consideration of the 
presence of repeated inspection mistakes. 

People of long inspection experience have suggested that the following 
factors contribute to a lowered inspection efficiency (considering 
the simple case of rating an item as “good” or “bad,” with special em- 
phasis on visual inspection as distinct from a gaging inspection opera- 
tion): 

(i) The average process (or lot) proportion defective influences in- 
spector efficiency. When the fraction defective is high, the risk 
of classifying a “good” item as “bad” seems also comparatively 
high, and the risk of classifying a “bad” item as “good” is 
relatively low. For a low process fraction defective, the situation 
as to the two types of errors is generally reversed. 

(ii) The rate at which the inspector is required to work clearly af- 
fects the validity of the results of the inspection operation. 
Though an optimum rate usually exists, it is not always known. 
In addition, where there is a continuous flow of product, as on 
a conveyor belt, the velocity of flow changes with different items 
of manufacture and is often independent of the number of in- 
spectors available to perform the operation. 

(iii) The nature of the product, as well as the defects for which the 
inspection is being made, is often significant. Even though the 
maximum acceptable degree for a visual defect is clearly defined 
by engineering specifications and is illustrated by an adequate 
set of samples (needless to say, this is not always the case), many 
borderline cases arise requiring subjective decisions by the in- 
spector. Moreover, some defects, by their physical nature, are 
difficult to discover. 

(iv) A large number of personal factors also contribute to diminished 
inspection reliability, for example, extent of inspector experience 
and training, working environment, imperfect inspector vision, 
inspection fatigue, the inspector’s conception of the importance 
of the defects he is considering, indifference and carelessness, 
etc. 

To illustrate the effect of these factors, we may cite instances in which 
lots of ware which had been rejected by sampling inspection and sub- 
jected to a complete inspection, sometimes failed to pass a second 
sampling inspection, and occasionally after a second complete reinspec- 
tion, failed to pass a third sampling inspection! This discouraging situa- 
tion could be laid largely to low inspection efficiency, though the 
handling contributed some defects while the inspection operations were 
being performed. 

To examine more specifically how inspection error may disrupt the 
anticipated features of a sampling inspection scheme, one is led to consider 
the “effective” fraction defective, say $p'$, which is related to the 
“true” fraction defective $p$ by the equation 

$$p' = p(1 - p_2) + (1 - p)p_1, \tag{1}$$

where $p_1$ is the probability of classifying a non-defective as defective, 
and the probability of classifying a defective as non-defective is $p_2$. In 
view of (i) in the enumeration of the factors contributing to inspection 
error, we note that $p_1$ and $p_2$ may be functions of $p$. In this discussion 
our interest will be centered on the case in which $p_1$ and $p_2$ are constant, 
that is, $p'$ is a linear function of $p$, 

$$p' = p_1 + (1 - p_1 - p_2)p, \tag{2}$$

and also the case in which $p_1$ and $p_2$ are related linear functions of $p$, 
say, 

$$p_1 = a + bp, \qquad p_2 = c - bp,$$

where again we obtain $p'$ as a linear function of $p$, 

$$p' = a + (1 - a + b - c)p. \tag{2'}$$


One would ordinarily have little confidence in fitting a more compli- 
cated relationship to an inspection operation. 
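
The two linear cases can be checked numerically with a small sketch. The constant error rates .02 and .05 are those used in the worked example later in the paper; the coefficients a, b and c in the second case are hypothetical values chosen for illustration, and the function names are assumptions.

```python
def effective_fraction(p, p1, p2):
    """Equation (1): effective fraction defective under inspection error.

    p1 = probability a non-defective is called defective,
    p2 = probability a defective is called non-defective.
    """
    return p * (1.0 - p2) + (1.0 - p) * p1


def linear_error_rates(p, a, b, c):
    """Related linear error rates p1 = a + b*p and p2 = c - b*p."""
    return a + b * p, c - b * p


if __name__ == "__main__":
    # Constant errors: equation (1) agrees with the linear form (2).
    p, p1, p2 = 0.02, 0.02, 0.05
    print(effective_fraction(p, p1, p2), p1 + (1.0 - p1 - p2) * p)

    # Related linear errors (hypothetical a, b, c): agrees with (2').
    a, b, c = 0.01, 0.2, 0.06
    q1, q2 = linear_error_rates(0.05, a, b, c)
    print(effective_fraction(0.05, q1, q2), a + (1.0 - a + b - c) * 0.05)
```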

Our fundamental concern will be with the transformation of the 
operating characteristic curve resulting from the introduction of the 
presence of inspection error. To take the simplest example, for a single 
sampling plan in which the sample size is $n$ and the allowable number of 
defectives is $c$, we have the probability $L(p)$ of accepting a lot of 
proportion defective $p$ as 

$$L(p) = \sum_{m=0}^{c} \binom{n}{m} p^m q^{n-m}, \qquad (q = 1 - p) \tag{4}$$

for efficient inspection and the usual conditions for the use of the 
approximation (4). Introducing $p_1$ and $p_2$, we obtain 

$$L'(p) = \sum_{m=0}^{c} \binom{n}{m} \left(p_1 + [1 - p_1 - p_2]p\right)^m \left(p_2 + [1 - p_1 - p_2]q\right)^{n-m}, \tag{5}$$

which defines the effective operating characteristic curve. Since 

$$L'(p) = L(p_1 + [1 - p_1 - p_2]p), \tag{6}$$


one operating characteristic curve may be derived from the other in a 
simple graphic, or arithmetic, manner in view of assumptions (2) and (2’). 

As an example of the magnitude of the effect of errors in inspection 
on the operating characteristics of a sampling plan, we consider one 
based on a single sample procedure, designed to discharge the following 
requirements: 

$$\alpha = .1 \text{ for } p_{t_1} = .02, \qquad \beta = .1 \text{ for } p_{t_2} = .05, \tag{7}$$

where $\alpha$ is as usual defined to be the maximum risk of rejecting lots of 
quality $p_{t_1}$, or better, and $\beta$ is the maximum risk of accepting lots of 
quality $p_{t_2}$, or worse. Taking the lot size to be 10,000 (say), one finds for 
a sample of 235, with seven as the number of allowable defectives, these 
requirements are closely discharged. Table I compares the operating 
characteristics of this sampling plan with those of the same plan when 
constant inspection errors of the magnitude 

$$p_1 = .02, \qquad p_2 = .05 \tag{8}$$

are assumed. 
TABLE I 
OPERATING CHARACTERISTICS OF SINGLE SAMPLE PLAN, n = 235, c = 7, 
WITH NO INSPECTION ERROR, AND WITH PROBABILITY 
OF ERROR, p1 = .02, p2 = .05 

p              L_p                 p'               L'_p 
True           Approximate         Effective        Approximate Effective 
Proportion     Probability of      Proportion       Probability of 
Defective      Accepting Lot       Defective        Accepting Lot 
----------     --------------      ----------       --------------------- 
.000           1.000               .020             .90 
.005           .999+               .025             .76 
.010           .997+               .029             .63 
.015           .97                 .034             .45 
.020           .90                 .039             .30 
.025           .76                 .043             .21 
.030           .60                 .048             .13 
.035           .42                 .053             .07 
.040           .27                 .057             .04 
.045           .17                 .062             .02 
.050           .10                 .066             .015 
.060           .03                 .076             .003 
.070           .008                .085             .001 
.080           .001+               .094             .000+ 
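
The entries of Table I can be approximated with a short calculation based on equations (4) and (6). This is a sketch under the binomial approximation named in the text; the function names are assumptions, and small differences from the tabulated values are to be expected from rounding and from whatever tables or charts the author used.

```python
from math import comb


def accept_prob(n, c, p):
    """Equation (4): probability of at most c defectives in a sample of n."""
    return sum(comb(n, m) * p**m * (1.0 - p) ** (n - m) for m in range(c + 1))


def effective_fraction(p, p1, p2):
    """Equation (2): effective fraction defective for constant error rates."""
    return p1 + (1.0 - p1 - p2) * p


def effective_accept_prob(n, c, p, p1, p2):
    """Equation (6): the same curve evaluated at the effective fraction defective."""
    return accept_prob(n, c, effective_fraction(p, p1, p2))


if __name__ == "__main__":
    n, c, p1, p2 = 235, 7, 0.02, 0.05
    for p in (0.0, 0.01, 0.02, 0.03, 0.04, 0.05):
        print(p,
              round(accept_prob(n, c, p), 3),
              round(effective_fraction(p, p1, p2), 3),
              round(effective_accept_prob(n, c, p, p1, p2), 3))
```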





One direct and simple method of restoring the desired risks at the 
“acceptable” and “non-acceptable” quality levels immediately suggests 
itself. If we choose a sampling plan which maintains the chosen risks $\alpha$ 
and $\beta$ for the effective fraction defective corresponding to the true 
acceptable and non-acceptable quality levels, we obtain the desired 
operating characteristics for at least the two salient quality levels. 
Since 

$$p_{t_2} - p_{t_1} > p'_{t_2} - p'_{t_1}, \tag{9}$$

a more discriminatory sampling scheme will be required; thus one of 
the penalties of the presence of inspection error is the need for increased 
sampling. 

To continue the previous example, the quality requirements (7) are 
transformed to 

$$\alpha = .1 \text{ for } p'_{t_1} = .039, \qquad \beta = .1 \text{ for } p'_{t_2} = .066. \tag{7'}$$

Table II presents the operating characteristics of the single sample 
scheme for which the sample size is 460 and the allowable number of 
defectives is 23. This provides approximately the proper risks (7’), with 
the constant inspection inefficiency $p_1 = .02$, $p_2 = .05$. These operating 
characteristics should be compared with those in the second column of 
Table I. 


TABLE II 
OPERATING CHARACTERISTICS OF THE SINGLE SAMPLE PLAN, 
n = 460, c = 23, WITH p1 = .02, p2 = .05 

p              p'               L'_p 
True           Effective        Effective 
Proportion     Proportion       Probability of 
Defective      Defective        Accepting Lot 
----------     ----------       -------------- 
.000           .020             .999+ 
.005           .025             .999 
.010           .029             .995 
.015           .034             .97 
.020           .039             .90 
.025           .043             .80 
.030           .048             .61 
.035           .053             .44 
.040           .057             .32 
.045           .062             .18 
.050           .066             [illegible] 
.060           .076             .02 
.070           .085             .003 
.080           .094             .000+ 
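
One way to see why the sample must grow is to search for the smallest single sampling plan meeting the two risk points, first at the true quality levels and then at the effective ones. The search below is an illustrative assumption and is not the procedure used in the paper, which relied on published tables; it may land on plans slightly different from the 235 and 7, and 460 and 23, quoted in the text. All function names are hypothetical.

```python
from math import comb


def accept_prob(n, c, p):
    """Binomial probability of at most c defectives in a sample of n."""
    return sum(comb(n, m) * p**m * (1.0 - p) ** (n - m) for m in range(c + 1))


def effective_fraction(p, p1, p2):
    """Effective fraction defective for constant error rates p1, p2."""
    return p1 + (1.0 - p1 - p2) * p


def find_plan(p_good, p_bad, alpha=0.10, beta=0.10, max_c=60):
    """Smallest (n, c) with accept_prob >= 1 - alpha at p_good and <= beta at p_bad."""
    n = 1
    for c in range(max_c + 1):
        n = max(n, c + 1)
        # The smallest n meeting the beta risk never decreases as c grows,
        # so the search can resume from the previous n.
        while accept_prob(n, c, p_bad) > beta:
            n += 1
        if accept_prob(n, c, p_good) >= 1.0 - alpha:
            return n, c
    raise ValueError("no plan found within max_c")


if __name__ == "__main__":
    p1, p2 = 0.02, 0.05
    print("no inspection error:  ", find_plan(0.02, 0.05))
    print("with inspection error:", find_plan(effective_fraction(0.02, p1, p2),
                                              effective_fraction(0.05, p1, p2)))
```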





We have briefly noted the possibility of designing a sampling scheme 
whose operating characteristics in the environment of an estimated inspection 
inefficiency satisfactorily reproduce those which would be 
anticipated under a plan allowing for no mistakes. We may now inquire 
as to the channel in which corrective energies could be most profitably 
directed, if the existing inspection inefficiency is not to be supinely 
tolerated. It is easily recognized that the two types of inspection errors 
cause opposing changes in the operating characteristics of a sampling 
plan; for some value of p, say p*, they will exactly compensate, and this 
value is obtained from the relation 


    $p = p_1 + (1 - p_1 - p_2)\,p$,    (10)
that is,
    $p^* = \frac{p_1}{p_1 + p_2}$.    (11)


For a chosen sampling plan, when $p < p^*$, we will have $L_{p'} < L_p$, that is,
the probability of accepting the lot will be reduced as a result of inspec-
tion inefficiency; the converse will be true for $p > p^*$. Moreover, as p
becomes farther removed from p*, the magnitude of the difference
$L_{p'} - L_p$ will increase monotonically. For small values of p, by writing
equation (10) in the form 


    $p_1(1 - p) = p\,p_2$,    (12)


we deduce the approximate relation 


    $p^* \approx \frac{p_1}{p_2}$,    (13)
and this suggests how seriously a large ratio of the two probabilities of 
error will affect the operating characteristics. 
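
The compensating value p* of relations (10), (11) and (13) can be checked numerically. The following is a minimal sketch; the function names and the two pairs of error rates are assumptions chosen for illustration, the second pair being one in which p* is small enough for the approximation (13) to be close.

```python
def effective_fraction(p, p1, p2):
    """Relation (10): p' = p1 + (1 - p1 - p2) p."""
    return p1 + (1.0 - p1 - p2) * p


def compensation_point(p1, p2):
    """Relation (11): the true proportion p* at which p' equals p."""
    return p1 / (p1 + p2)


def compensation_point_approx(p1, p2):
    """Relation (13): approximation to p*, adequate only when p* is small."""
    return p1 / p2


if __name__ == "__main__":
    for p1, p2 in ((0.02, 0.05), (0.001, 0.05)):  # illustrative error rates
        p_star = compensation_point(p1, p2)
        print(
            f"p1={p1}, p2={p2}: p*={p_star:.4f}, "
            f"approx={compensation_point_approx(p1, p2):.4f}, "
            f"p' at p*={effective_fraction(p_star, p1, p2):.4f}"
        )
```

Below p* the effective proportion exceeds the true one, and above p* it falls short of it, which is the change of sign described in the text.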
More generally, an examination of the equation 


    $p' = p_1 + (1 - p_1 - p_2)\,p$    (14)


reveals for the most typical industrial situations in which $p$, $p_1$, $p_2$ are
sensibly small

    $p' \approx p + p_1$.    (15)
The principal influence is thus exerted by that class of inspection failure 
in which good product is rated as bad. But it is exactly this class of 
error which is most readily controlled. With a procedure which provides 
for the inspector’s rejects being confirmed by a supervisor or a special 
reject inspector, we may sharply diminish the likelihood of such mis- 
takes, providing the policy of additional inspection can be justified on 
the basis of product economics or customer relations. It may be asserted 
that this scrutiny will result in a proneness of the inspector to fall into 
mistakes of the other type, passing bad product as good, to avoid 
criticism. However, with $p_1 = 0$, we have

    $p' = p - p\,p_2$,    (16)

indicating, by comparison with (15), that the reduction in $p_1$ carries
more weight in restoring the expected operating characteristics than
the resulting increase in $p_2$ further disrupts them.
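
A small numerical comparison may make the point concrete. The figures below are assumptions chosen purely for illustration: the first case uses error rates of .02 and .05, while the second sets the rate of rating good product as bad to zero and doubles the rate of passing bad product as good, to mimic the feared side effect of reject checks.

```python
def effective_fraction(p, p1, p2):
    """p' = p1 + (1 - p1 - p2) p, with p1 and p2 the two error rates."""
    return p1 + (1.0 - p1 - p2) * p


def distortion(p, p1, p2):
    """Absolute gap between the effective and the true proportion defective."""
    return abs(effective_fraction(p, p1, p2) - p)


if __name__ == "__main__":
    p = 0.02  # hypothetical true proportion defective
    before = distortion(p, p1=0.02, p2=0.05)  # both kinds of error present
    after = distortion(p, p1=0.0, p2=0.10)    # rejects confirmed, p2 doubled
    print(f"gap before: {before:.4f}, gap after: {after:.4f}")
```

For a small true proportion defective, the gap shrinks sharply even though the second error rate has been doubled, in line with the comparison of (15) and (16).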

No general discussion will be attempted as to the change in average 
outgoing quality limit which will result from inspection error. The 
analysis is made cumbersome under the assumption that the probabil- 
ity of error p2 applies to the replacing of discarded items in samples and 
lots. Moreover, the AOQL may be attained for a value of p which is 
unrealistically large. We may note, however, one simple result which 
arises under the hypothesis that $p_1 = 0$, perhaps as a result of special
reject checks, and that all replaced items are indeed good product. 
Under these conditions, the average outgoing quality has the repre- 
sentation 


    $AOQ' = N^{-1}\left[\,n p\,p_2 + (N - n)\,p\,L_{p'} + (N - n)\,p\,p_2\,(1 - L_{p'})\,\right]$,    (17)


where N is the lot size and the other symbols retain their earlier defini- 
tions. This reduces to 


    $AOQ' = N^{-1}(N - n)\,p' L_{p'} + \frac{p_2}{1 - p_2}\,p'$,    (18)


whose maximum is to be compared with that of the average outgoing 
quality 


    $AOQ = N^{-1}(N - n)\,p\,L_p$    (19)


when no inspection mistakes are made. Direct examination of (18) indi-
cates in the industrial range of $p'$ and $p_2$ that the extreme value of (18)
is in a few instances increased to the extent of 1% over (19), but in
most cases the increase is substantially less than 1%. For example, with
our earlier sampling plan, n = 235, c = 7, for which the AOQL is approxi-
mately 1.9% under non-erring inspection, we obtain an increase on the
order of 1%, when $p_2$ is assumed as large as .1 (providing the lot propor-
tion defective remains in a limited range, say less than .20).
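
The reduction of (17) to (18), and the comparison with (19), can be explored numerically. The sketch below rests on stated assumptions: a binomial model for the sample count, a hypothetical lot size N = 10000 (the text gives none), and the special case in which good product is never rated bad, so that p' = p(1 - p_2). Function names are illustrative.

```python
from math import comb


def acceptance_probability(n, c, fraction):
    """P(at most c apparent defectives in a sample of n), binomial model."""
    return sum(
        comb(n, k) * fraction**k * (1.0 - fraction) ** (n - k)
        for k in range(c + 1)
    )


def aoq_eq17(N, n, c, p, p2):
    """Average outgoing quality in the form of equation (17)."""
    accept = acceptance_probability(n, c, p * (1.0 - p2))
    return (n * p * p2 + (N - n) * p * accept
            + (N - n) * p * p2 * (1.0 - accept)) / N


def aoq_eq18(N, n, c, p, p2):
    """Average outgoing quality in the reduced form of equation (18)."""
    p_eff = p * (1.0 - p2)
    accept = acceptance_probability(n, c, p_eff)
    return (N - n) * p_eff * accept / N + p2 / (1.0 - p2) * p_eff


def aoq_no_error(N, n, c, p):
    """Equation (19): average outgoing quality with no inspection mistakes."""
    return (N - n) * p * acceptance_probability(n, c, p) / N


if __name__ == "__main__":
    N, n, c, p2 = 10000, 235, 7, 0.1  # N is a hypothetical lot size
    grid = [i / 1000.0 for i in range(1, 200)]  # lot proportions below .20
    print("max of (19):", max(aoq_no_error(N, n, c, p) for p in grid))
    print("max of (18):", max(aoq_eq18(N, n, c, p, p2) for p in grid))
```

The two forms (17) and (18) should agree to rounding error at every p, which gives a quick check on the algebra.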

Viewed broadly, then, the results of our discussion on the effects of
inspection failures upon sampling inspection plans suggest that it is
largely sufficient to eliminate the rating of non-defective product as
defective in order to have assurance that the guarantees of the plan are
substantially obtained.


ELASTICITY OF PHYSICAL QUANTITIES AND
FLEXIBILITY OF UNIT PRICES IN THE 
DIMENSION OF TIME* 


Frederick C. Mills
National Bureau of Economic Research 


When the interaction of the physical quantities and the 
unit prices of a given commodity is marked by regularities 
over time, whether in the recurrence of cyclical or seasonal 
patterns or in the persistence of secular relations, it is conven- 
ient to define the differential movement of quantity with 
respect to price by a coefficient of elasticity, of price with re- 
spect to quantity by a coefficient of flexibility. These are 
mathematically identical with the familiar coefficients of 
Marshall and Moore, but in this usage they relate to desig- 
nated temporal frameworks. The examples presented in the 
present paper deal with price-quantity interrelations in busi- 
ness cycles, in specific cycles in physical quantities, in specific 
cycles in unit prices, and in homogeneous secular periods. No 
attempt is made to eliminate or hold constant the dynamic 
influences that play upon the market. In each case we seek to 
measure the differential responsiveness of the quantity and 
price factors to the various impinging forces that operate 
within the temporal framework in question. If there are per- 
sistent regularities in the operation of these forces, appropri- 
ately designated coefficients of elasticity and flexibility can 
serve as useful instruments in the analysis of economic change. 


I 


Time and the concept of elasticity. The concept of elasticity, first
formulated by Cournot in defining the law of demand, was given
its familiar form by Alfred Marshall. In conventional usage the term 
is applied generally to the responsiveness of the demand for a given 
commodity to changes in the price of that commodity. More precisely, 
elasticity of demand is defined as the ratio of the relative change in 
quantity demanded (of a given commodity) to the corresponding rela- 
tive change in price, when the relative changes are infinitesimal. The 
demand function and a derived coefficient of elasticity are assumed to 
relate to a moment of time, or to a period within which there are no 
changes in tastes, technology, income distribution, or other factors 
that might modify the functional relationship between quantity de- 
manded and unit price. 


* I am indebted to Maude R. Pech for aid in the preparation of this paper, and to H. Irving Forman
for the construction of the charts.

The conceptual importance of this instrument is very great. Only
modest progress has been made, however, in the empirical determina- 
tion of demand functions and demand elasticities. Observations on 
commodity prices and quantities purchased are ordered in time, and 
thus inevitably reflect the play of dynamic factors that tend to modify 
the shape and location of demand functions. Great ingenuity has been 
shown in the attempt to measure the influence of these dynamic 
factors and to eliminate their effects on the recorded observations. Use- 
ful approximations to the demand curves of theory have, indeed, been 
achieved for a limited number of commodities, but uncertainty at- 
taches even to these, and generalization of the results is hazardous. 
These efforts will continue and further progress will be made. It is 
probably inevitable, however, that residual traces of dynamic factors 
will remain in the best of these efforts and that major temporal 
modifications of attendant conditions will complicate empirical find-
ings on demand functions for most commodities.

The present paper deals with investigations looking in another direc- 
tion. In the study of economic processes, whether our concern be with 
cyclical movements or with long-term changes, the interrelated move- 
ments of unit prices and physical quantities are of central importance. 
These are movements in the dimension of time. They are affected by the 
diversity of forces acting upon both variables, as well as by the direct 
cause-and-effect relations between these variables. We can do some- 
thing, in the choice of the frame of reference, to organize and standard- 
ize these forces, but they cannot be held constant while the specific 
and presumably static relations between price and quantity are studied. 
The diverse impinging forces, as they operate over time, are part of the 
record available to us. The relations between price and quantity that 
we may study are not timeless, then, nor are they isolated interactions. 
To the extent, however, that there are consistent patterns in these 
interactions, patterns that repeat themselves with some degree of 
regularity, they are of clear concern to the economist. 

In this investigation we shall make use of a measure of elasticity 
mathematically identical with Marshall’s coefficient of elasticity of de- 
mand. We shall measure the elasticity of quantity, with reference to 
unit price, in various temporal frameworks. The phrase “with reference 
to unit price” will not mean that changes in quantity are a function of 
changes in price, in any causal sense. In some cases, perhaps in most 
cases, the quantities and prices of given commodities will be responding 
concurrently to the pressure of outside forces. We may say that here 
the measure of elasticity of physical quantities defines the differential 
responsiveness of quantity to these forces. It is, as in traditional usage,
a ratio of a relative change in quantity to the corresponding relative
change in price, but the quantity change will not necessarily be at- 
tributable to the price change. In some cases it will be in part attribut- 
able to the change in unit price and in part to the play of other forces, 
but it will not be possible to distinguish these two elements. Thus the 
related changes in quantity and price may not be thought of as reflect- 
ing movements along fixed demand curves (or fixed supply curves). 

There would be little justification for this usage, and little hope of 
deriving significant results, if the time periods over which related quan- 
tity and price changes are to be studied were selected haphazardly. 
Under such circumstances coefficients of elasticity would be expected 
to change erratically, as different time units and different periods of 
time were employed. The justification must be found in the choice of 
time periods within which there is some regularity in the action of the 
impinging forces and in the related movements of commodity quantities 
and prices. We shall have more to say about the choice of temporal 
frameworks within which price-quantity and similar relations are to 
be studied. 

The quantities to be employed in studying the relations between 
commodity prices and quantities over time need not be restricted to 
quantities demanded or consumed. Various series defining changes in 
physical volumes—amounts produced, delivered, exported, ordered— 
may be used. For this reason, and for others, the results are not to be 
thought of as in any way approximating traditional demand functions 
or coefficients. Elasticity is a meaningful and desirable term, for present 
purposes, but the elasticity measured is not that of demand. Indeed, 
it is most important that the term elasticity of demand be restricted to 
the conventional meaning. To avoid confusion, appropriate qualifying 
adjectives will be employed in the present paper to indicate the type of 
elasticity involved and the temporal framework employed in given 
instances. In addition, a symbol other than the customary η will be
used to designate measures of temporal elasticity. 

One further draft will be made upon conventional terminology. 
Henry L. Moore has used the term flexibility of price for the ratio of a 
relative change in the price of a commodity to the corresponding rela-
tive change in quantity.1 The coefficient of price flexibility is the recip-
rocal of the coefficient of elasticity of demand.2 It is derived, of course,
when price is treated as the dependent variable. We shall use flexibility 


1 “Elasticity of Demand and Flexibility of Prices,” Journal of the American Statistical Association,
March 1922, pp. 8-19.
2 This is not necessarily true when the two coefficients are empirically determined, with correlation
less than unity.

of price as the reciprocal of elasticity of physical quantities, with quali-
fying adjectives to indicate the temporal framework to which given
measurements relate. A symbol other than φ, introduced by Moore,
will be employed, to avoid confusion with his coefficient. 


II 


Temporal periods for the study of price-quantity relations. Various 
instruments are available for measurement of the concurrent move- 
ments of physical volume and unit prices, as these elements change 
together over time, in mutual interaction or in response to outside 
forces. Our present concern is with two. The coefficient of elasticity is 
given by 

    $e = \frac{dx}{dy} \cdot \frac{y}{x}$

where x is a measure of physical quantity, y of corresponding unit
prices. The coefficient of price flexibility is given by

    $\frac{dy}{dx} \cdot \frac{x}{y}$


The relationships are the conventional ones. The novel problem arises 
in the choice of temporal frameworks within which related changes in 
quantity and price are marked by regularities that give them economic 
significance. The following frames of reference give promise of yielding 
significant patterns. 

Business cycles. A reference framework defined by turning points 
(lows or highs) in general business activity in the United States 
has been set up by the National Bureau of Economic Research, 
in connection with its general study of business cycles.3 A com-
plete business cycle runs its course between successive lows (or
highs). When a given series is corrected for seasonal variations the 
adjusted measurements on that series for one of these periods reflect

3 These turning points, for the period since the Civil War, are set forth below. For explanation of
the derivation of the lows and highs, see Measuring Business Cycles, by Arthur F. Burns and Wesley C.
Mitchell, National Bureau of Economic Research, 1946.

Trough       Peak         Trough           Trough       Peak         Trough
Dec. 1867    June 1869    Dec. 1870        Aug. 1904    May 1907     June 1908
Dec. 1870    Oct. 1873    Mar. 1879        June 1908    Jan. 1910    Jan. 1912
Mar. 1879    Mar. 1882    May 1885         Jan. 1912    Jan. 1913    Dec. 1914
May 1885     Mar. 1887    Apr. 1888        Dec. 1914    Aug. 1918    Apr. 1919
Apr. 1888    July 1890    May 1891         Apr. 1919    Jan. 1920    Sept. 1921
May 1891     Jan. 1893    June 1894        Sept. 1921   May 1923     July 1924
June 1894    Dec. 1895    June 1897        July 1924    Oct. 1926    Dec. 1927
June 1897    June 1899    Dec. 1900        Dec. 1927    June 1929    Mar. 1933
Dec. 1900    Sept. 1902   Aug. 1904        Mar. 1933    May 1937     May 1938

the play of forces associated with cycles in general business,
of trend factors and random influences, and, if they be present, 
of specific cyclical factors independent of cycles in general busi- 
ness. If observations cover a number of business cycles the effects 
of random factors may be in some degree eliminated by averaging. 
(Intra-cycle trend is not removed.) It is possible, then, to deter- 
mine whether there is a persistent pattern in the behavior of a 
given series or in the related behavior of paired series within the 
reference framework provided by cycles in general business.4
Within this framework we may study the differential responsiveness 
of the volume factor with reference to cyclical changes in unit 
prices (the cyclical elasticity of physical quantities) or the differ- 
ential responsiveness of unit prices with reference to cyclical 
changes in volume (the cyclical flexibility of prices). 

Specific cycles in quantities. Many physical volume series are marked 
by cycles specific to the particular series, cycles that may differ 
materially in timing and duration from cycles in business at large. 
Hog cycles and construction cycles are familiar examples of such 
distinctive rhythms; similar specific cycles are discernible in the 
records for many commodities. These cycles provide obvious 
frameworks within which the movements of prices with reference 
to concurrent quantity changes may be studied. 

Specific cycles in prices. Price series have their own distinctive cycles, 
for many commodities. The elasticity of quantities with reference 
to price changes may be studied in these frames of reference. In this 
case, as for general business cycles and specific cycles in quantities, 
there is no isolation of price and quantity movements. Other 
forces than those of price play upon quantities when specific price 
cycles provide frames of study. This is an unavoidable condition, 
with the data available to us. But price factors are given primary 
place, in so far as this is possible, in the selection of this framework, 
just as quantity factors are given primary place in the use of 
specific cycles in quantities as frameworks for the study of price 
flexibility. 

Long cycles. If the reality of long cycles can be established, these will 
provide frames similar to the reference cycles derived from the 
records of business cycles. The elasticity of quantities and the 
flexibility of prices within these cycles could then be measured, 
and their significance tested. 

Homogeneous secular periods. In defining each of the cyclical periods


4 Measurements of the behavior of a given series during a business cycle are said to define a refer-
ence cycle.

named above we have sought to set off a section of time marked by
some regularity in the forces at play, particularly the forces affect- 
ing the two variables with whose relative changes we are concerned. 
Regularity may be tested, for cyclical measures of this sort, with 
reference to the degree of variation found among observations for 
different cycles. Such tests are not possible when we deal with 
periods of nonrecurring change, but such periods are of central 
interest to the student of economic change. There are stages of 
economic development marked by the play of fairly persistent 
secular forces, these stages giving way to others during which quite 
different secular influences are operative. The years 1914-19 
marked such a secular turning point in many aspects of economic 
life in the United States. We may not set down precise criteria of 
secular homogeneity, but certain stages of development may be 
defined with reasonable accuracy. Interactions of quantities and 
prices, and their differential responsiveness to secular forces, may 
be measured for such periods by coefficients of secular or inter- 
cycle elasticity and flexibility. The significance of such measures 
is not open to exact statistical determination, but comparisons of 
measures for different periods and consideration of these measures 
in their full economic setting for given periods will provide bases 
for judgments as to their economic import and value. 

Special periods subject to the play of defined forces. The differential
responsiveness of quantities and prices to particular forces may 
be studied for specified periods, known to be distinctive in the 
forces at work. Thus the period of NRA regulation is a sharply 
defined section of time during which special forces were acting 
upon the structure of production and the system of prices. Here, 
although regularity of pattern in the sense of repetitiveness is not 
to be expected, the condition that distinctive and identifiable 
forces are operating is realized. For such a period measures of elas- 
ticity and flexibility of the type here discussed may be highly 
serviceable. 






































This list of possible reference frames for use in studying the interac- 
tions of quantities and prices and their responsiveness to outside forces 
is not exhaustive. For certain commodities markedly subject to seasonal 
influences (such as eggs or butter), the year would be a suitable period 
for study. The essential feature is that particular forces be distin- 
guished, by the judicious selection of temporal frameworks. Persistence 
of operation of these forces within a given period, in the case of non-
recurring actions, or persistence of pattern in different periods, in the
case of cyclical or seasonal forces, is requisite, of course, if the measure-
ments are to have meaning. If purely random factors predominate no
such persistence is to be expected. 


III 


Elasticity and flexibility in business cycles. For purposes of analysis 
the seasonally corrected price and quantity records for a given com- 
modity, during a given business cycle, are expressed in relative terms. 
The base of the relatives for a given economic series in a given business 
cycle is the average of the original monthly (or other) figures for that 
series in that cycle. Averages of these relatives are then struck for each 
of nine stages of that cycle. Stage I is the initial trough, stage V the 
peak, and stage IX the terminal trough. Stages II, III, and IV mark off 
successive thirds of the phase of expansion, and stages VI, VII, and 
VIII mark off successive thirds of the phase of contraction.5 
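
The reduction of a monthly series to nine stage averages can be sketched as follows. This is a simplified illustration under stated assumptions, not the full procedure of Measuring Business Cycles: each turning-point stage (I, V, IX) is represented here by the single turning month, the months strictly between turning points are split into three nearly equal parts, and each phase is assumed to contain at least three such months. The function names and the sample data are hypothetical.

```python
from statistics import mean


def split_thirds(values):
    """Split a list into three consecutive parts of nearly equal length."""
    n = len(values)
    cuts = [round(i * n / 3) for i in range(4)]
    return [values[cuts[i]:cuts[i + 1]] for i in range(3)]


def stage_averages(monthly, peak_index):
    """Nine stage averages for one cycle running from trough to trough.

    monthly: seasonally corrected figures, first and last items at troughs.
    peak_index: position of the peak month within `monthly`.
    """
    base = mean(monthly)  # cycle average used as the base of the relatives
    relatives = [100.0 * value / base for value in monthly]
    expansion = relatives[1:peak_index]         # months between trough and peak
    contraction = relatives[peak_index + 1:-1]  # months between peak and trough
    stages = [relatives[0]]                                    # stage I
    stages += [mean(part) for part in split_thirds(expansion)]  # stages II-IV
    stages.append(relatives[peak_index])                       # stage V
    stages += [mean(part) for part in split_thirds(contraction)]  # stages VI-VIII
    stages.append(relatives[-1])                               # stage IX
    return stages


if __name__ == "__main__":
    # Hypothetical thirteen-month cycle with its peak in the seventh month.
    series = [70, 80, 90, 100, 110, 120, 130, 120, 110, 100, 90, 80, 70]
    print([round(s, 1) for s in stage_averages(series, peak_index=6)])
```

Averaging such stage relatives over several cycles gives figures of the kind shown in the tables that follow.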

Stage averages for the price of steel billets and the production of 
steel ingots for ten business cycles in the United States between 1900 
and 1938 are given in Table 1. They are plotted by reference cycle 
stages in Figure 1. This chart shows the rapid advance in ingot produc- 
tion and the much more moderate rise in the prices of steel billets in 
the first stage of reference expansion, the continuing sharp increase in 
production and the accelerating but less rapid advance in prices during 
the remaining stages of expansion. The movements during contraction 
are generally symmetrical with those of expansion, with quantity 
produced declining much more rapidly than prices except in the ter- 
minal stage of the phase of contraction. 

These same materials are plotted in a coordinate framework in 
Figure 2. Here the movements of quantities are shown on the horizontal 
axis, the movements of prices on the vertical axis. This representation 
reveals more clearly the apparent responsiveness of the quantity factor 
to price changes, and the apparent flexibility of prices as quantities 
change, within the framework of business cycles. It is clear that no 
assumption of direct causality is justified in the study of this relation- 
ship. We may define the relative change in steel production with ref- 
erence to the corresponding relative change in steel prices between, 
say, reference cycle stages II and III. What we are measuring in so 
doing is not a change in quantity resulting from a corresponding 
change in price, but a change in quantity attributable to the influence 
of a combination of forces of which the price factor is one. Conversely, 
we may measure the differential responsiveness of unit prices with 


5 This procedure is described in detail in Measuring Business Cycles, op. cit.

reference to corresponding changes in physical quantities as both vari-
ables change in the course of business cycles. In each case the change
in the “independent variable” is to be thought of as a standard of 
reference rather than a primary causal influence. 

With these qualifications in mind it is convenient to define the rela- 
tive responsiveness of physical quantities and of unit prices by means 
of the coefficients of elasticity and flexibility previously described. 
These are given in Table 2 for the several cyclical stages and phases, 
and for the full cycle.6 It will be understood that these measurements
are based upon the average movements of steel ingot production and 


TABLE 1 


AVERAGES DEFINING THE REFERENCE CYCLE PATTERNS OF STEEL 
INGOT PRODUCTION AND THE PRICES OF STEEL BILLETS IN 
TEN BUSINESS CYCLES, 1900-38* 








                         Stage Averages
                   (Reference cycle relatives)

              I    II   III    IV     V    VI   VII  VIII    IX
Quantity     70    91   106   118   128   122    98    76    74
Prices       86    87    98   109   114   113   103    94    90





* Billets and ingots are produced at different stages of the manufacturing process, but for present 
purposes fluctuations in the output of ingots may be taken to define with acceptable accuracy volume 
changes related to cyclical movements in the prices of steel billets. Ingot production is the total for the 
United States. Prices are for Bessemer steel billets at Pittsburgh. 


steel billet prices over the ten reference cycles covered by the present 
observations. 

The coefficient of elasticity of steel ingots indicates that over the 
full reference cycle there is, on the average, a change of 2.17 per cent 


6 The coefficients are derived from the entries in Table 1. Stage coefficients are computed from
the reference cycle relatives for successive stages. For interstage period II-III, for exam-
ple, the coefficient of elasticity is the ratio of the relative rates of change (quantity to price) at
the midpoint of the line joining the quantity and price observations for stages II and III. Thus, for this
period, $e = \frac{+15}{+11} \cdot \frac{92.5}{98.5} = +1.28$. The coefficients for the expansion phase are derived from reference
cycle relatives for stages I and V, those for the contraction phase from similar relatives for stages V
and IX. (A stage or phase coefficient derived from the customary formula and relating to the midpoint
of a straight line joining two pairs of observations on prices and quantities is, of course, identical with a
coefficient of arc elasticity.) Full cycle measures are averages of the coefficients for the phases of expan-
sion and contraction. When these phase coefficients are of the same sign, geometric means of arithmetic
and harmonic averages of the two phase coefficients will give mutually consistent sets of elasticity and
flexibility coefficients for the phases and the full cycle. When the phase measures are of opposite sign
mutually consistent coefficients (i.e., measures of elasticity and flexibility that are reciprocals of one
another for both phases and for the full cycle) may not be obtained by an averaging process. In such
cases, and wherever the coefficients for the phases of expansion and contraction differ materially, it is
well to place chief reliance, for interpretative purposes, on the coefficients for the separate phases rather
than upon an average for the full cycle.

in production for every change of 1 per cent in price, these changes
being in the same direction. The separate coefficients for the phase of 
expansion and the phase of contraction are but slightly different. In 
both phases the relative changes in output are more than double the 
corresponding relative changes in unit price. 
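
The stage and phase coefficients can be recomputed from the rounded entries of Table 1 with the arc formula described in footnote 6. The sketch below is illustrative: the function names are hypothetical, and because the published coefficients were presumably computed from unrounded stage averages, results from the rounded table may differ from Table 2 in the last digit.

```python
from math import sqrt

STAGES = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"]
QUANTITY = [70, 91, 106, 118, 128, 122, 98, 76, 74]  # Table 1, ingot production
PRICES = [86, 87, 98, 109, 114, 113, 103, 94, 90]    # Table 1, billet prices


def arc_elasticity(x0, x1, y0, y1):
    """Relative change in x divided by relative change in y, at midpoints.

    Undefined (division by zero) when y0 equals y1.
    """
    rel_x = (x1 - x0) / ((x0 + x1) / 2.0)
    rel_y = (y1 - y0) / ((y0 + y1) / 2.0)
    return rel_x / rel_y


def full_cycle(a, b):
    """Geometric mean of the arithmetic and harmonic averages of two phase
    coefficients of the same sign, carrying that sign."""
    if a * b <= 0:
        raise ValueError("phase coefficients must be nonzero and of the same sign")
    arithmetic = (a + b) / 2.0
    harmonic = 2.0 * a * b / (a + b)
    sign = 1.0 if a > 0 else -1.0
    return sign * sqrt(arithmetic * harmonic)


if __name__ == "__main__":
    for i in range(8):
        e = arc_elasticity(QUANTITY[i], QUANTITY[i + 1], PRICES[i], PRICES[i + 1])
        print(f"{STAGES[i]}-{STAGES[i + 1]}: elasticity {e:+.2f}, flexibility {1 / e:+.2f}")
    expansion = arc_elasticity(QUANTITY[0], QUANTITY[4], PRICES[0], PRICES[4])
    contraction = arc_elasticity(QUANTITY[4], QUANTITY[8], PRICES[4], PRICES[8])
    print(f"expansion {expansion:+.2f}, contraction {contraction:+.2f}, "
          f"full cycle {full_cycle(expansion, contraction):+.2f}")
```

Averaging the phase coefficients in this way keeps the full-cycle flexibility equal to the reciprocal of the full-cycle elasticity, which a simple arithmetic mean would not do.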

The measures for the separate interstage periods are all positive, 


FIGURE 1
Average Movements of Steel Ingot Production and
Prices of Steel Billets in Ten Business Cycles, 1900-1938
[Chart: production and price, in reference cycle relatives, plotted by cycle stages I-IX against a scale of months.]


indicating that the concurrent changes of production and price are in 
the same direction in all cyclical periods. There are some notable dif- 
ferences in magnitude, however. In the first period of business recovery 
(between reference cycle stages I and II) the coefficient of elasticity 
has a value of +22.57. Quantities respond immediately and strongly 
to the first pressures of general business recovery. The rise in unit price
is slight at this stage. Similarly, the first impact of recession between
reference cycle stages V and VI brings a much sharper drop in steel
production than in steel prices (the coefficient of elasticity is +5.45).
It is to be noted that the coefficients of elasticity for the two periods of
contraction following stage VI are much greater than the coefficients
of elasticity for the corresponding periods of business expansion. When
contraction is well under way, quantity is relatively more responsive 
and billet prices are relatively less responsive than they are in corre- 


TABLE 2 


COEFFICIENTS OF ELASTICITY OF STEEL INGOT PRODUCTION AND 
FLEXIBILITY OF STEEL BILLET PRICES IN TEN 
BUSINESS CYCLES, 1900-38 








                              Stage Measures
                             Interstage Period

               I-II   II-III   III-IV    IV-V    V-VI   VI-VII  VII-VIII  VIII-IX
Elasticity   +22.57    +1.28    +1.01   +1.81   +5.45    +2.36     +2.76    +0.61
Flexibility   +0.04    +0.78    +0.99   +0.55   +0.18    +0.42     +0.36    +1.64

                              Phase Measures

               Expansion   Contraction   Full Cycle Measure
Elasticity       +2.09        +2.27            +2.17
Flexibility      +0.48        +0.44            +0.46





sponding periods of business expansion. In all interstage periods except 
the terminal stage of recession the coefficients of elasticity exceed unity 
and the coefficients of price flexibility fall below unity. Only between 
reference cycle stages VIII and IX is the elasticity of output less than 
unity. The influence of recovery is reflected thus early in the retarded 
decline of physical output. Prices, however, continue to drop at a rate 
exceeding that prevailing during the first stage of contraction. Except 
for this one period, therefore, steel ingot production is elastic, positively 
(positively because the movements of production and price are in the 
same direction), and the prices of steel billets are inflexible, positively. 

The patterns of related price-quantity behavior represented in Fig- 
ures 1 and 2 are quite consistent over the period covered by the present 
records. Those for nine of the ten cycles are of the type represented 
by Figure 2, marked by price and quantity increases during general 
business expansions, price and quantity declines during general business 
contractions. Only for one of the ten cycles—that of 1924-27—is the 
pattern different. In this cycle we have prices declining during the 
phase of reference expansion instead of increasing, as in the nine other
cycles covered. Measures of conformity and variance tests indicate
that the pattern of average behavior represented in Figure 2 is clearly 
significant. 

FIGURE 2
Pattern of Steel Ingot Production and Prices of Steel Billets
in Ten Business Cycles, 1900-1938


A quite different pattern of cyclical behavior is represented by meas-
urements of the price of raw cotton and the quantity of cotton exports
(Table 3 and Figure 3). These again relate to stages of cycles in business
at large. The general relationship is clearly inverse. Unit prices con-
form to the cycle, rising during the phase of general business expansion
between reference cycle stages I and V (prices rise, in fact, with a slight 
lag, starting after stage II), and falling during business contraction 
(between reference cycle stages V and IX). These are, of course, prices 
in the United States; they reflect the cyclical pressures present in 
domestic markets. The pattern of quantity movements reverses the
cyclical tides (except during two interstage periods), declining during 
business expansion and rising during business contraction. Here, it is 
reasonable to assume, we have something approaching the short-term 
market relationships of conventional theory. The volume of cotton 
exports is strongly influenced by business conditions abroad and by 
cotton prices in American markets. Foreign business cycles are not 
synchronous with United States cycles and, indeed, differ from country 
to country. In the reference frame here set up, therefore, the influence of
foreign business conditions is variable, if not erratic; the one persistent
force affecting the quantity of exports is the customary stimulating 
influence of falling prices upon demand, the customary depressing 
influence of rising prices. Accordingly, with averages covering many 
domestic cycles, the volume of cotton exports tends to be an inverse 
function of prices in the United States. This is the case for the present 
set of observations. 


TABLE 3

AVERAGES DEFINING THE REFERENCE CYCLE PATTERNS OF COTTON
EXPORTS AND COTTON PRICES IN SEVENTEEN
BUSINESS CYCLES, 1870-1938*

A. Stage Averages
(Reference cycle relatives)

                 I    II   III    IV     V    VI   VII  VIII    IX
Quantity       102   108   104    95    89    95   103    96   106
Prices          97    97   101   110   113   109   101    97    93

B. Coefficients of Elasticity of Exports and Flexibility of Prices†

Stage Measures
Interstage Period
                  I-II    II-III    III-IV      IV-V      V-VI    VI-VII  VII-VIII   VIII-IX
Elasticity      -27.71     -0.93     -1.06     -2.42     -1.81     -1.06     +1.74     -2.35
Flexibility      -0.04     -1.07     -0.94     -0.41     -0.55     -0.94     +0.57     -0.43

Phase Measures
               Expansion   Contraction   Full Cycle Measure
Elasticity         -0.89         -0.90                -0.90
Flexibility        -1.12         -1.11                -1.11

* The quantity series is that for raw domestic exports. The price is that for raw middling upland
cotton in New York.
† These coefficients are computed from stage averages carried to one decimal place, where the
rounded averages for successive stages are equal.


Measures of elasticity and flexibility are given in Part B of Table 3. 
The phase measures are nearly identical. The elasticity of foreign de- 
mand for American cotton, within the framework of domestic business 
cycles, is measured by a coefficient of —0.90. All but one of the inter- 
stage elasticity measures are negative; six of the eight are negative and 
numerically in excess of unity. 

The cyclical behavior of cotton exports and raw cotton prices is 
distinctly less consistent than is that of steel production and unit price. 
The patterns for the seventeen reference cycles, viewed separately, 
show considerable variation. Of the seventeen, six are of the same type 
as the average pattern plotted in Figure 3. They are marked by prices
that move with the general cyclical tide and by exports that move in- 
versely to that tide. The other eleven patterns differ widely in their 
characteristics. Tests of statistical significance indicate that raw cot- 
ton prices conform to the patterns of reference cycles in business, but
that cotton exports do not move in a clearly significant pattern. The
most common pattern is that corresponding to the average shown in
Figure 3, but there is too much variation to justify the acceptance of
this pattern as clearly typical.

Figure 3
Pattern of Raw Cotton Exports and Raw Cotton Prices
in Seventeen Business Cycles, 1870-1938
Averages by reference cycle stages

Measures similar to those given in Table 1 have been computed for
64 commodities.⁷ Combining the observations for these individual com-
modities we obtain measures descriptive of the average behavior of
physical quantities and commodity prices at wholesale during cycles
in general business. This average pattern, as defined by the entries in
Table 4, Part A, is shown graphically in Figure 4. This aggregative
picture, it is clear, is one of positive conformity of both prices and
quantities to business cycles. In this framework general cyclical forces
shape the fluctuations of both prices and quantities, overcoming any
tendencies that may exist toward inverse movements.

TABLE 4

PATTERNS OF BEHAVIOR OF QUANTITIES AND PRICES DURING
BUSINESS CYCLES, AVERAGES FOR 64 COMMODITIES

A. Stage Averages
(Reference cycle relatives)

                 I    II   III    IV     V    VI   VII  VIII    IX
Quantities      90    98   101   108   112   107    99    94    94
Prices          94    99   104   110   112   110    99    91    90

B. Coefficients of Elasticity of Quantities and Flexibility of Prices

Stage Measures
Interstage Period
                  I-II    II-III    III-IV      IV-V      V-VI    VI-VII  VII-VIII   VIII-IX
Elasticity       +1.59     +0.69     +1.34     +1.47     +2.34     +0.71     +0.69     -0.32
Flexibility      +0.63     +1.45     +0.75     +0.68     +0.43     +1.41     +1.45     -3.12

Phase Measures
               Expansion   Contraction   Full Cycle Measure
Elasticity         +1.25         +0.81                +1.01
Flexibility        +0.80         +1.23                +0.99

Our immediate interest is in the relative responsiveness of the price
and quantity factors to the forces of business cycles. Over the full cycle
(see the last entry in Part B, Table 4) there is little difference between
the two, for the sample of commodities here studied. By convention,
we should classify the quantity movements as elastic (positively), the
price movements as inflexible (positively), since the coefficient e is
greater than unity while f is less than unity. But the difference is too
small to be significant.

6 These measures, and the analytical procedures employed, are described in a monograph on Price-
Quantity Interactions in Business Cycles (National Bureau of Economic Research, 1946). The para-
graphs of the text immediately following are from this study.

7 The sample of 64 commodities is reasonably comprehensive. It includes 32 raw materials, 32 manu-
factured goods; 33 farm products, 31 nonfarm products; 48 producer goods, 22 consumer goods (including
duplications). It is not, however, presented as representative of all commodities entering into trade.

There is considerable variation in the time coverage of the series included. For two commodities
we have observations going back to 1858, covering twenty business cycles; for three commodities the
coverage is restricted to the three business cycles that have run their course since 1924. The other 59
commodities fall between these extremes. Observations for recent cycles are more numerous than those
for earlier cycles.
Figure 4
Pattern of Related Price-Quantity Movements in Business Cycles
64 Commodities

If we go behind the approximate equality of the price and quantity
measures for the full cycle we find notable differences in behavior from
phase to phase and from stage to stage of reference cycles. The phase
coefficients tell an illuminating story of related quantity and price
changes in business cycles. During periods of general business expan- 
sion physical volume increases 1.25 per cent for every 1 per cent rise 
in price. During contractions commodity prices decline 1.23 per cent 
for every 1 per cent fall in physical volume. We shall find wide differ- 
ences among commodity groups in the correlated behavior of quantity 
and price, but for the aggregate of commodities here studied quantity 
is elastic, positively, in response to the stimulations of expansion, and 
unit prices are flexible, positively, under the pressures of contraction.
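
The phase coefficients just cited can be checked roughly against the stage averages in Part A of Table 4. The sketch below assumes, as the tabulated figures suggest, that the coefficient of elasticity is the change in the quantity relative divided by the change in the price relative between two stages, and that flexibility is its reciprocal. The function and variable names are illustrative only, and the rounded averages are used because the one-decimal averages are not reproduced here.

```python
# Stage averages from Table 4, Part A (rounded reference cycle relatives).
STAGES = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"]
QUANTITIES = [90, 98, 101, 108, 112, 107, 99, 94, 94]
PRICES = [94, 99, 104, 110, 112, 110, 99, 91, 90]

PEAK = STAGES.index("V")          # stage V marks the reference peak
TERMINAL = len(STAGES) - 1        # stage IX closes the cycle


def phase_coefficients(quantity, price, start, end):
    """Return (elasticity, flexibility) between two stages.

    Assumed definition: elasticity is the change in the quantity relative
    divided by the change in the price relative; flexibility is the reciprocal.
    """
    dq = quantity[end] - quantity[start]
    dp = price[end] - price[start]
    return dq / dp, dp / dq


expansion = phase_coefficients(QUANTITIES, PRICES, 0, PEAK)
contraction = phase_coefficients(QUANTITIES, PRICES, PEAK, TERMINAL)

for name, (e, f) in [("expansion", expansion), ("contraction", contraction)]:
    print(f"{name}: elasticity {e:+.2f}, flexibility {f:+.2f}")
```

On the rounded averages this gives about +1.22 and +0.82 for expansion and +0.82 and +1.22 for contraction. The published values (+1.25, +0.80, +0.81, +1.23) rest on stage averages carried to one decimal place, so small differences are to be expected.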

We trace these relations more closely in the measures for interstage
periods. Quantities are elastic, relatively, in three of the four interstage
periods of expansion (in period II-III alone are prices more responsive
than quantities to cyclical pressures), inelastic in three of the four in-
terstage periods of contraction (in period V-VI alone are quantities
more responsive than prices to cyclical pressures). All the interstage
measures are positive (indicating direct relations between price and
quantity changes) except that for the terminal period of contraction
(VIII-IX). At this final stage prices are declining and quantities in-
creasing,⁸ but the increase in quantity is relatively less rapid than the
decline in price, and the coefficient e is below unity.

The coefficients for the eight interstage periods show a distinct and
suggestive pattern of change. Following the decline in the elasticity of
quantities after stage II, there is a progressive increase in the stage
elasticity of physical quantities (and a corresponding progressive de-
cline in the flexibility of unit prices) between periods II-III and V-VI
of reference cycles. We may think of these two factors as representing
alternative means by which markets respond to and adapt themselves
to the pressures of rising demand during business expansion, of declin-
ing demand during contraction. After the reversal of the second period,
accommodation is effected in increasing degree through physical out-
put. Prices become decreasingly flexible, relative to quantities. There
is a sharp contrast indeed between the situation prevailing between
stages II and III, when prices advance 1.45 per cent for every 1 per
cent rise in quantity, and that in the final period of reference expansion
(IV-V) when prices advance only 0.68 per cent for every 1 per cent
rise in quantity. These growing strictures on prices, relative to the
forces affecting output and sales, are a notable feature of the present
evidence bearing on the later stages of business expansion.

8 The measures of elasticity and flexibility in Table 4 are based upon stage averages carried to one
decimal place. The slight rise in quantity between stages VIII and IX is not apparent in the rounded
figures in Part A of Table 4.

TABLE 5

AVERAGES DEFINING THE REFERENCE CYCLE PATTERNS OF FACTORY
EMPLOYMENT AND AVERAGE HOURLY EARNINGS IN
FOUR BUSINESS CYCLES, 1921-38*

A. Stage Averages
(Reference cycle relatives)

                                    I    II   III    IV     V    VI   VII  VIII    IX
Employment                         90    96   102   109   114   111   102    94    90
Average hourly earnings            93    95    98   101   106   107   107   103   102

B. Coefficients of Elasticity of Employment and Flexibility of Average Hourly Earnings

Stage Measures
Interstage Period
                                     I-II    II-III    III-IV      IV-V      V-VI    VI-VII  VII-VIII   VIII-IX
Elasticity of employment            +3.03     +1.95     +2.20     +0.93     -2.84    +42.70     +2.14     +4.46
Flexibility of hourly earnings      +0.33     +0.51     +0.45     +1.08     -0.35     +0.02     +0.47     +0.22

Phase Measures
                                  Expansion   Contraction   Full Cycle Measure
Elasticity of employment              +1.80         +6.12                +3.32
Flexibility of hourly earnings        +0.56         +0.16                +0.30

* The employment series is based upon estimates by the United States Bureau of Labor Statistics
of the number of skilled and unskilled workers in 13 to 90 manufacturing industries. Average hourly earn-
ings are the National Industrial Conference Board estimates for skilled and unskilled workers in 25
manufacturing industries. The comparability of the two series is impaired, of course, by the difference
in coverage, but the present estimates define the general cyclical pattern with reasonable accuracy.

Figure 5
Pattern of Factory Employment and Hourly Earnings
in Four Business Cycles, 1921-1938
Averages by reference cycle stages

In the first period of contraction (V-VI) quantities drop sharply, 
while prices decline but slightly. Thereafter the record is the opposite 
of that for expansion. There is a pronounced and progressive decline 
in the elasticity of quantities (and a progressive increase in the flexi-
bility of unit prices). Quantity falls at a declining rate, relatively to 
prices; prices decline at accelerating rates, relatively to quantities. 
Between stages V and VI prices decline 0.43 per cent for every 1 per 
cent fall in quantities; between stages VII and VIII average unit prices 
fall 1.45 per cent for every 1 per cent fall in quantities. As general con- 
traction spreads, and pervades the economy, resistances to continuing 
reductions in physical quantity are stronger, relatively, than the re- 
sistances to continuing price declines. For the sample of commodities 
here represented price is distinctly the more responsive factor, in all 
except the first period of business contraction. 

We deal, finally, in the reference cycle framework, with the volume 
of factory employment and average hourly earnings of factory workers. 
For these series we have observations covering four reference cycles 
between 1921 and 1938.⁹ Reference cycle relatives, averaged by cyclical
stages, are given in Table 5 together with derived measures of the elas- 
ticity of factory employment and the flexibility of hourly earnings. The 
joint behavior pattern is plotted in Figure 5. 

Employment is, of course, relatively more responsive to cyclical 
pressures than are hourly earnings. For the full cycle the elasticity of 
employment is defined by a coefficient of +3.32. For every change of 
1 per cent in earnings there is a concurrent change, in the same direc- 
tion, of 3.32 per cent in the number of workers employed. The expan- 
sion and contraction phases differ materially, however. The elasticity 
of employment averages +1.80 during periods of business expansion, 
+6.12 during contractions. In both phases the volume of employment 
is more responsive than earnings to the forces of business cycles, but 
during contraction the relative change in employment is sharply ac- 
centuated. The comparative inflexibility of average hourly earnings in 
contraction accounts for the difference, since the amplitude of employ- 
ment movements is the same in the two phases. 

9 Measurements descriptive of the joint cyclical behavior of employment and hourly earnings in
these four cycles may not be generalized without additional evidence. The period covered is short, and
it is known that special circumstances influenced the relations in question during this period. We may
note, however, that customary statistical tests indicate that the average cyclical pattern of employ-
ment changes is significant; hourly earnings show high conformity for the phase of expansion and for
the full cycle, but there is no clear evidence of significance.

The pattern of interstage changes is revealing. We note the relative
elasticity of employment, the relative stability of earnings as general
business expansion gets under way. The employment factor remains
elastic, and hourly earnings relatively inflexible, through stage IV.
The pressures of rising demand for labor are met, during most of the
expansion phase, by sharp advances in the number employed, by
small advances in hourly earnings. But in the terminal period of expan-
sion (between stages IV and V) the flexibility of earnings exceeds unity, 
while the elasticity of employment is less than unity. In the first stage 
of contraction the coefficients are negative, reflecting the persistence of 
advances in hourly earnings, although volume of employment has 
started to drop. The resistance of earnings to downward pressures is 
manifest in a flexibility measure close to zero for interstage period 
VI-VII. Thereafter, during the two remaining periods of business con- 
traction, the coefficients of flexibility are positive (i.e., hourly earnings 
decline as employment falls), but they remain well below unity. Hourly 
earnings are far less responsive to the forces of business contraction 
than to the forces of expansion. 
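
The interstage pattern described above can be traced approximately from the rounded stage averages in Part A of Table 5. The sketch assumes that elasticity is the change in the employment relative divided by the change in the earnings relative between successive stages, with flexibility as the reciprocal; this definition, the helper names, and the handling of a zero change are assumptions made for illustration.

```python
# Stage averages from Table 5, Part A (rounded reference cycle relatives).
STAGES = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"]
EMPLOYMENT = [90, 96, 102, 109, 114, 111, 102, 94, 90]
HOURLY_EARNINGS = [93, 95, 98, 101, 106, 107, 107, 103, 102]


def ratio(numerator, denominator):
    """Divide, returning None when the rounded denominator shows no change."""
    if denominator == 0:
        return None
    if numerator == 0:
        return 0.0
    return numerator / denominator


def interstage_measures(quantity, price):
    """Return (period, elasticity, flexibility) for each interstage period."""
    rows = []
    for i in range(len(STAGES) - 1):
        dq = quantity[i + 1] - quantity[i]
        dp = price[i + 1] - price[i]
        period = f"{STAGES[i]}-{STAGES[i + 1]}"
        rows.append((period, ratio(dq, dp), ratio(dp, dq)))
    return rows


measures = interstage_measures(EMPLOYMENT, HOURLY_EARNINGS)
for period, e, f in measures:
    e_text = "undefined" if e is None else f"{e:+.2f}"
    f_text = "undefined" if f is None else f"{f:+.2f}"
    print(f"{period:>8}: elasticity {e_text}, flexibility {f_text}")
```

The signs agree with Part B of Table 5: the only negative coefficients fall in period V-VI, where earnings still rise while employment drops. In period VI-VII the rounded earnings averages are equal, so the elasticity cannot be computed from them; the published +42.70 presumably rests on less heavily rounded averages.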


IV 


Measures of elasticity and flexibility relating to movements in the 
framework of business cycles have specialized meaning and uses. The 
relatives on which they rest do not, in most cases, reflect the full cyclical 
swings of the two factors, for the imposition of a standard reference 
frame will dampen or eliminate price or quantity fluctuations not syn- 
chronized with cycles in general business. To supplement the reference 
cycle record the fluctuations of commodity prices must be studied 
within a frame provided by specific cycles in the corresponding physical 
quantity series, and the fluctuations of physical quantities must be 
studied within a frame provided by specific cycles in prices. The three 
sets of records are needed to define cyclical market behavior. 

Specific cycles as frames of reference: Flexibility of prices during specific 
cycles in quantities. Between 1865 and 1937 seventeen specific cycles 
in the physical volume of hog receipts at Chicago may be identified. 
In the present analysis of this series, and in the related study of hog 
prices, the reference frame is provided by the cyclical turning points 
(i.e., the lows and highs) in the series of hog receipts itself. Phases of 
expansion and contraction and cyclical stages (from I to IX) are 
identified with the series of hog receipts, not with cycles in business at 
large. For some commodities, of course, specific cycles may coincide 
with or closely approach general reference cycles. Specific cycles in 
hog receipts do not conform to reference cycles.¹⁰

10 The index of conformity is -10, on a scale that runs from -100 to +100. It is not significantly
different from 0, which would represent no conformity, direct or inverse.

Stage measurements for hog prices and hog receipts, within the
specific cycle framework for hog receipts, are given in Table 6 and are
plotted in Figure 6. These are averages for seventeen specific cycles.
Conformity and variance tests indicate that the pattern of price
changes, within this framework, is significant. The general picture is
of an inverse relationship, with prices falling as receipts increase and
rising as receipts diminish. In fitting the price series into the quantity
framework we are giving the quantity series an independent status. We
are, in effect, setting up the hypothesis that price is a function of
quantity.¹¹

TABLE 6

AVERAGES DEFINING THE PATTERN OF HOG RECEIPTS IN SEVENTEEN
SPECIFIC CYCLES AND THE PATTERN OF HOG PRICES IN
THE QUANTITY FRAMEWORK, 1865-1937*

A. Stage Averages
(Cycle relatives, quantity framework)

                 I    II   III    IV     V    VI   VII  VIII    IX
Quantity        75    87   101   111   134   112   106    92
Prices         118   104    91    91    91    99   111   114   119

B. Coefficients of Flexibility of Prices in Quantity Framework

Stage Measures
Interstage Period
                  I-II    II-III    III-IV      IV-V      V-VI    VI-VII  VII-VIII   VIII-IX
Flexibility      -0.85     -0.90         0         0     -0.47     -2.08     -0.18     -0.37

Phase Measures
               Expansion   Contraction   Full Cycle Measure
Flexibility        -0.46         -0.55                -0.50

* The quantity series defines hog receipts at Chicago, by number. Prices are for heavy hogs at
Chicago, in 100 pound units.


Stage and phase measures of the flexibility of hog prices, in response
to cyclical changes in hog receipts, are given in Part B of Table 6. The
coefficient for the full cycle is -0.50; the separate phase measures for
expansion and contraction do not differ materially from this. In gen-
eral, for every change of 1 per cent in the volume of hog receipts, dur-
ing specific cycles in such receipts, there is an inverse change of about
one-half of 1 per cent in prices. The interstage measurements indicate
a fairly persistent pattern of price response to quantity changes, but
one notable feature is to be remarked. The responsiveness of prices to
increases in hog receipts is relatively strong (although the coefficients
remain below unity) between interstage periods I and III. Receipts
continue to increase between stages III and V but prices do not de-
cline. Whether this apparent drop in price flexibility represents acci-
dents of sampling or a persistent economic characteristic of hog markets
is a matter for investigation.¹² In the contraction phase of specific
cycles in hog receipts, prices are most flexible during the first two peri-
ods (between stages V and VII). When quantities decrease, as when
they increase, hog prices tend to move inversely, the tendency being
strongest during the first stages of decline and advance. Thereafter,
prices become less responsive to continued movements of quantities.

Figure 6
Pattern of Hog Receipts and Hog Prices
in Seventeen Specific Cycles in Hog Receipts, 1865-1937
Averages by specific cycle stages, quantity framework

11 This involves the assumption that, within the framework of specific cycles in quantities, causal
relations run from changes in hog receipts to changes in prices. If the quantity series conformed closely
to reference cycles the influence upon prices of specific variations in quantities would be indistinguish-
able from the influence of cyclical forces at large. But when the specific quantity cycles are apparently
unrelated to reference cycles, as in the present case, the assumption that price changes within the
quantity framework reflect the influence of quantity changes is more tenable. Mutual interaction of
prices and quantities is not ruled out and, it is understood, the influence of other forces is never clearly
eliminated. However, the process of averaging observations covering many quantity cycles gives oppor-
tunity for the offsetting and cancellation of forces not persistently operating in that specific framework.

The hypothesis that prices are independent, with quantity fluctuations playing a dependent role,
can be tested by studying the movements of hog receipts within a framework provided by specific
cycles in hog prices. (This is done, for another commodity, in the example next following.) Since there
is significant conformity of quantities within the price framework there is no clear basis for rejecting
either hypothesis. This conclusion is consistent with the assumption that there is mutual interaction
of hog prices and hog receipts within the present context of short- and medium-term market rela-
tions.

12 Records of price flexibility in response to changes in hog slaughter, covering 14 specific cycles in
the slaughter records, reveal a similar sharp decline in flexibility after stage III of the expansion phase.

Elasticity of quantities during specific cycles in prices. Specific cycles in
prices may be used as a framework within which the behavior of physi-
cal volume series may be studied, just as quantity cycles provide a 
framework for the study of prices. As an example we study the pattern 
of lead ore shipments within a frame provided by specific cycles in the 
prices of lead ore. Measurements derived by averaging data for eleven 
specific cycles in lead prices, between 1896 and 1938, are given in Table 
7 and are plotted in Figure 7. Coefficients of the elasticity of lead ship- 
ments, with reference to lead prices, are given in Part B of Table 7. 


TABLE 7

AVERAGES DEFINING THE PATTERN OF LEAD ORE PRICES IN ELEVEN
SPECIFIC CYCLES AND THE PATTERN OF LEAD ORE SHIP-
MENTS IN THE PRICE FRAMEWORK, 1896-1938*

A. Stage Averages
(Cycle relatives, price framework)

                 I    II   III    IV     V    VI   VII  VIII    IX
Quantity        78    93   102   114   134   112   110    95    88
Prices          81    87    98   114   127   117   107    91    84

B. Coefficients of Elasticity of Shipments in Price Framework

Stage Measures
Interstage Period
                  I-II    II-III    III-IV      IV-V      V-VI    VI-VII  VII-VIII   VIII-IX
Elasticity       +2.46     +0.78     +0.74     +1.50     +2.18     +0.20     +0.91     +0.96

Phase Measures
               Expansion   Contraction   Full Cycle Measure
Elasticity         +1.20         +1.02                +1.11

* Price data relate to the New York market. The quantity series measure shipments in the Joplin
District.


We should first note that the framework provided by specific cycles
in lead prices does not differ greatly from the general framework of ref-
erence cycles, for lead prices conform fairly well to general business
cycles.¹³ Since the two frameworks are so nearly synchronous we are
not able to disentangle the influence of price changes upon lead ship-
ments from the influence of general cyclical forces. Accordingly, al-
though there is a temptation to interpret the relations shown in Figure
7 as those characteristic of a conventional supply function, it would be
unsafe to do so. The consistent rise in lead shipments as lead prices rise
and the consistent decline in shipments as prices fall may be in part a
reflection of a causal relationship running from prices to shipments.
It may also reflect the pressure of common forces, related to cycles in
business at large, impinging upon both shipments and prices.¹⁴

13 The index of conformity for the full cycle is +82.

The response of lead ore shipments to price changes is measured by
a coefficient of full cycle elasticity of +1.11. The phase coefficient for
expansion is somewhat higher than that for contraction, but both
exceed unity. The stage coefficients indicate high elasticity of quantities
in the first period of expansion and in the first period of contraction,
with sharp declines in elasticity in the intermediate stages of expansion
and contraction.

Figure 7
Pattern of Prices and Shipments of Lead Ore
in Eleven Specific Cycles in Lead Ore Prices, 1896-1938
Averages by specific cycle stages, price framework

14 Although we cannot separate the two sets of influences in this case, we should note that lead
shipments conform more closely to specific cycles in lead prices than they do to cycles in general
business. The two conformity indexes are +100 and +64, respectively, the former being highly signifi-
cant, the latter significant. The variance test reenforces this evidence. There is an indication here that
the relations shown in Figure 7 represent responses of quantities to independent price changes, although
these movements are not isolated from cyclical movements in business at large.

V

Elasticity and flexibility during homogeneous secular periods. We pass
to relationships between secular movements of economic series. Here
we do not seek recurring patterns, as we do in studying cyclical phe-
nomena, but regularity that is manifest in a persistent inter-cycle re-
lationship. In Table 8 are given cycle averages for agricultural produc- 
tion in the United States for the seven business cycles occurring be- 
tween 1911 and 1938, and ratios of the prices received by farmers to 
prices paid by farmers for goods used in farm production and family 
maintenance, averaged for the same seven cycles. These averages are
plotted, on double logarithmic scale, in Figure 8. The average rela-
tionship between the price ratios and the production series is defined
by the straight line fitted to the observations plotted in Figure 8.
The equation to the line is log Y = 4.8951 - 1.4995 log X, where log
Y and log X are, respectively, the logarithms of the price ratios and the
production indexes.

TABLE 8

VOLUME OF AGRICULTURAL PRODUCTION AND RATIO OF PRICES
RECEIVED BY FARMERS TO PRICES PAID BY FARMERS*
AVERAGES BY REFERENCE CYCLES, 1911-38

Reference Cycle      Agricultural      Ratio of Prices Received
Terminal Dates       Production†       by Farmers to Prices
(annual)                               Paid by Farmers‡
                            (Cycle average)

1911-14                  83.5                  98.5
1914-19                  86.7                 105.4
1919-21                  89.5                  98.0
1921-24                  91.8                  82.2
1924-27                  98.3                  88.3
1927-32                  99.6                  78.5
1932-38                  96.6                  76.6

* Data from the United States Bureau of Agricultural Economics.
† Relatives on the base 1935-39.
‡ Relatives on the base August 1909-July 1914.

The measure of immediate interest to us is the coefficient of log X 
in the above equation. This defines the inter-cycle flexibility of agri- 
cultural prices (“real” prices, in the sense that the ratio of prices re- 
ceived to prices paid defines average unit purchasing power of farm 
products).¹⁵ We interpret the coefficient, which has a value of -1.50,
in this manner: Inter-cycle relations between farm output and farm 
prices were such, over the course of the seven cycles occurring between 
1911 and 1938, that for every increase of 1 per cent in the volume of 
agricultural production there was a decline of 1.50 per cent in the real 
unit price of farm products. Real farm prices, that is, were flexible, 
inversely, in response to changes in farm output. 
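
The fitted line can be reproduced approximately from the cycle averages in Table 8. The sketch below assumes an ordinary least-squares fit of log Y on log X using common (base 10) logarithms; the fitting method is not stated in the passage above, and the function name is illustrative.

```python
import math

# Cycle averages from Table 8, in chronological order (1911-14 ... 1932-38).
PRODUCTION = [83.5, 86.7, 89.5, 91.8, 98.3, 99.6, 96.6]     # X
PRICE_RATIO = [98.5, 105.4, 98.0, 82.2, 88.3, 78.5, 76.6]   # Y


def fit_log_log(x_values, y_values):
    """Least-squares fit of log10(Y) = a + b * log10(X); returns (a, b)."""
    log_x = [math.log10(x) for x in x_values]
    log_y = [math.log10(y) for y in y_values]
    mean_x = sum(log_x) / len(log_x)
    mean_y = sum(log_y) / len(log_y)
    s_xy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    s_xx = sum((lx - mean_x) ** 2 for lx in log_x)
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope


intercept, flexibility = fit_log_log(PRODUCTION, PRICE_RATIO)
print(f"log Y = {intercept:.4f} {flexibility:+.4f} log X")
```

Under these assumptions the slope comes out close to -1.50 and the intercept close to 4.90, in line with the equation given in the text. Because both variables are in logarithms, the slope is itself the coefficient of flexibility: the percentage change in the real price associated with a 1 per cent change in output.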


15 $f = \frac{dy}{dx} \cdot \frac{x}{y} = \frac{d \log y}{d \log x}$
Figure 8
Volume of Agricultural Production and 'Real' Prices
Received by Farmers over a Period of Seven Business Cycles, 1911-1938
Averages by reference cycles

This relationship prevailed, with notable persistence, over the period
of the present records. There were temporary departures from it but
it is clear from the graphic record that between 1911 and 1938 rising
output was accompanied, secularly, by still more sharply declining real
prices for farm products. The story for recent years is, of course, dif-
ferent, and the record prior to 1911 was marked by other relationships. 
But for about a quarter century a persistent negative relationship 
prevailed between farm output and unit purchasing power, and set its 
mark upon the economic and social conditions of our time. 

A final example involves the comparison of inter-cycle movements 
(i.e., of secular changes) in the volume of manufacturing employment 
in the United States and output per wage earner in manufacturing 
plants (Table 9). The data are presented graphically in Figure 9, 
which is drawn on double logarithmic scale. In this case we are not 
dealing with unit prices and physical quantities, but the relations in-
volved lend themselves to definition by coefficients of elasticity and
flexibility of the type employed in preceding examples. 

The secular relations between volume of employment in manufac- 
turing industries and output per wage earner underwent a marked 
change at about the middle of the period here covered. The averages of 
employment and productivity are positively related for the first five 
of the business cycles occurring since the turn of the century. The equa-
tion of relationship is log Y = -1.3441 + 1.6853 log X, where log Y
and log X are, respectively, logarithms of the indexes of employment
and per capita output. For the six cycles between 1914-19 and 1932-38
(the 1914-19 cycle, which marks a stage of apparent transition, is
included in both sets of calculations) the relationship is negative. The
equation is log Y = 2.7700 - 0.2466 log X.

TABLE 9

MANUFACTURING EMPLOYMENT AND OUTPUT PER WAGE EARNER
IN THE UNITED STATES*
AVERAGES BY REFERENCE CYCLES, 1900-38

Reference Cycle      Manufacturing      Output per
Terminal Dates       Employment†        Wage Earner†
(annual)
                          (Cycle average)

1900-04                 114.6              107.0
1904-08                 130.4              113.9
1908-11                 139.5              112.5
1911-14                 149.5              124.0
1914-19                 177.0              135.2
1919-21                 177.0              127.5
1921-24                 166.7              151.8
1924-27                 175.5              171.0
1927-32                 161.4              188.9
1932-38                 157.2              184.7

* Data from Solomon Fabricant, Employment in Manufacturing, 1899-1939 (National Bureau of
Economic Research, 1942).
† Relatives on the base 1899.
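
The two slopes can be recovered approximately from the cycle averages in Table 9. As a sketch, the code below assumes least-squares fits of log employment on log output per wage earner, computed separately for the two periods, with the 1914-19 cycle included in both as the text indicates. The fitting method and all names are assumptions made for illustration.

```python
import math

# Cycle averages from Table 9, in chronological order (1900-04 ... 1932-38).
EMPLOYMENT = [114.6, 130.4, 139.5, 149.5, 177.0,
              177.0, 166.7, 175.5, 161.4, 157.2]              # Y
OUTPUT_PER_WAGE_EARNER = [107.0, 113.9, 112.5, 124.0, 135.2,
                          127.5, 151.8, 171.0, 188.9, 184.7]  # X

TRANSITION = 4  # position of the 1914-19 cycle, shared by both periods


def log_log_slope(x_values, y_values):
    """Least-squares slope of log10(Y) on log10(X)."""
    log_x = [math.log10(x) for x in x_values]
    log_y = [math.log10(y) for y in y_values]
    mean_x = sum(log_x) / len(log_x)
    mean_y = sum(log_y) / len(log_y)
    s_xy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    s_xx = sum((lx - mean_x) ** 2 for lx in log_x)
    return s_xy / s_xx


# First five cycles (1900-04 to 1914-19) and last six (1914-19 to 1932-38).
early = log_log_slope(OUTPUT_PER_WAGE_EARNER[:TRANSITION + 1],
                      EMPLOYMENT[:TRANSITION + 1])
late = log_log_slope(OUTPUT_PER_WAGE_EARNER[TRANSITION:],
                     EMPLOYMENT[TRANSITION:])

print(f"inter-cycle elasticity, early period: {early:+.2f}")
print(f"inter-cycle elasticity, late period:  {late:+.2f}")
```

Under these assumptions the early-period slope is close to +1.69 and the late-period slope close to -0.25, matching the coefficients of log X in the two equations.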

The inter-cycle elasticity of employment, with respect to changes 
in per capita productivity in manufacturing industries, was +1.69 for 
the cycles from 1900-04 to 1914-19, —0.25 for the cycles from 1914- 
19 to 1932-38. That is, during the first of these periods there was a 
secular increase of 1.69 per cent in volume of manufacturing employ- 
ment for every advance of 1 per cent in per capita productivity; after 
1914-19 there was a secular decline of 0.25 per cent in employment for 
every advance of 1 per cent in per capita productivity. This is a notable 
reversal indeed. Advancing productivity was accompanied by in-
creasing employment opportunities in manufacturing during the first
two decades of this century. During the second two decades produc-
tivity continued to advance, but employment declined. It is true, of
course, that the persistent depression of the ’30’s leaves a marked
impress on the cycle averages, but the reversal of earlier relationships
seems to antedate the recession that began in 1929.¹⁶

Figure 9
Manufacturing Employment and Output per Wage Earner
over a Period of Ten Business Cycles, 1900-1938
Averages by reference cycles


16 Such a bare descriptive statement as is provided by a coefficient of employment elasticity with 
respect to one related variable does not, of course, provide an explanation of the reversal we have noted. 
Changes in the direction and volume of investment, the increasing importance of service industries, 
alterations in international economic relations all influenced, directly or indirectly, the volume of em-
ployment in manufacturing in the United States. Beyond such specific circumstances, some part of the
explanation of the abrupt shift in earlier tendencies, and of the failure of manufacturing employment 
to advance as industrial efficiency increased, is to be sought in the working of the wage-cost-price 
mechanisms by which we adapt the use of resources to changing conditions of production and to shifts 
in demand. 
VI

Summary. In the study of demand and price relations emphasis has 
been placed, traditionally, on the question: What is the elasticity of 
demand (for a given commodity) with reference to changes in unit 
prices when no change occurs in any factor other than price that might 
affect the character of demand? In the present paper we have been con- 
cerned with such questions as these: Is quantity (of a given commodity) 
elastic or inelastic with reference to movements of unit prices within 
business cycles? Is quantity elastic or inelastic with reference to the 
trend of prices within relatively homogeneous secular periods? Similar 
questions might relate to the elasticity of quantities with reference to 
seasonal movements of prices, movements of prices within long cycles, 
hour-to-hour movements of prices, or year-to-year movements of 
prices. In dealing with these questions we deliberately introduce the 
factor of time, making no attempt to eliminate or hold constant the 
various forces other than those of volume and price that play upon the 
market. We seek to measure the differential responsiveness of the quan- 
tity and price factors to the various impinging forces that operate 
within a given temporal framework. In each case the meaning of the 
coefficient of quantity elasticity (or of the correlative measure of price 
flexibility) will be determined by the framework within which the 
price-quantity interactions are studied. The several coefficients for a 
given commodity may vary widely in value from framework to frame- 
work. (Thus the consumption of a given commodity may be inelastic 
within a seasonal framework, elastic within a framework of business 
cycles, inelastic with respect to secular changes in unit prices.) 

The examples given in the present paper have related, in the main, 
to quantities and prices, but measurements of the type suggested may 
be derived for any pair of series whose interaction in a temporal frame- 
work is of economic significance. The unit selling prices of manufac- 
tured goods and the labor cost of such goods, per unit, may constitute 
such a pair. The elasticity of physical output with reference to changes 
in labor costs per unit, and the flexibility of labor costs with reference 
to changes in output, may be of interest. The relative responsiveness 
(to cyclical forces, e.g.) of the prices of a given commodity at different 
distributive stages may be defined by coefficients of flexibility (price 
flexibility at one stage being measured with reference to price changes 
at another stage). 

Coefficients of the type cited are of limited significance unless they
define persistent regularities in economic processes. There may be
persistence of correlated rhythms of prices and quantities in their
cyclical or seasonal movements; there may be persistence of trend rela-
tions during an extended period. When the existence of continuing
uniformities of behavior can be established, appropriately designated
coefficients of elasticity and flexibility can serve a highly useful pur-
pose in the study of economic processes.
PRESENTING SEASONAL VARIATION TO THE
BUSINESS EXECUTIVE


ALAN S. DONNAHOE
Chamber of Commerce, Richmond, Virginia


In presenting the results of their analysis, business statis- 
ticians have failed to reach the full understanding of the aver- 
age business executive. Even with such a relatively simple 
concept as seasonal variation, the traditional forms of presen- 
tation may be greatly improved and clarified. 


STATISTICAL analysis is a mysterious subject to the average busi-
ness executive. To the extent that results of such analysis are
mysterious, they are ineffective. To the same extent, the business
statistician has failed in his job.

As a responsible staff member, the statistician should enjoy the full
trust of his associates. But trust must be distinguished from credulity,
which is defined as “belief on slight or uncertain evidence.” If the busi-
ness executive does not understand the statistical results presented,
they definitely represent uncertain evidence to him, no matter what his
faith in the statistician.

An excellent example may be found in one of the simplest of all
statistical operations: adjustment for seasonal variation. This is a
subject with which the business executive is intimately acquainted,
and yet he is often mystified by the manner in which it is presented by
statisticians.


TRADITIONAL FORMS OF PRESENTATION 


A brief example will show why the statistician has failed in his 
presentation of this relatively simple subject. Sales of a hypothetical 
department store are shown in Table 1. Traditional methods of pre- 
senting these data, including the adjusted dollar total and index num- 
bers on a fixed base, are shown in Table 2. Neither method will be 
clear to the average business executive. 

While he may have some sketchy grasp of the concept, the adjusted 
dollar figures will make little sense to the department store manager. 
He knows very well that his December volume is greatly in excess of 
his September volume. He must plan his stock and personnel in precise 
accord with this difference. To show these at the same general level 
is to violate his most basic standards of belief. The index data are
even more abstract and difficult to reconcile with his everyday knowl- 
edge of sales volume in dollars and cents. 
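
The arithmetic behind the traditional presentation can be sketched in Python. This is a minimal sketch under stated assumptions, not the author's published procedure: it takes the adjusted figure to be actual sales divided by the seasonal index (in per cent), and it uses a 1935-39 monthly average of 600 thousand dollars for the fixed-base index. That base value is not given in the text; it is inferred here from the ratio of the two halves of Table 2. The names `seasonally_adjusted` and `fixed_base_index` are introduced for illustration.

```python
# Monthly sales (thousands of dollars) and seasonal indexes from Table 1.
MONTHS = ["Jan.", "Feb.", "Mar.", "Apr.", "May", "June",
          "July", "Aug.", "Sept.", "Oct.", "Nov.", "Dec."]
SALES_1945 = [751, 853, 950, 1107, 1203, 1209, 1054, 956, 853, 957, 981, 1126]
SALES_1946 = [992, 1055, 1105, 1154, 1213, 1177, 1010, 941, 876, 1097, 1203, 1431]
SEASONAL = [90, 95, 100, 105, 110, 105, 90, 85, 80, 100, 110, 130]

# ASSUMPTION: 1935-39 monthly average implied by Table 2 (e.g. 834 / 1.39).
BASE_1935_39 = 600.0


def seasonally_adjusted(sales, seasonal):
    """Divide each month's sales by its seasonal index (per cent)."""
    return [s * 100.0 / idx for s, idx in zip(sales, seasonal)]


def fixed_base_index(adjusted, base):
    """Express adjusted sales as a percentage of the base-period average."""
    return [a * 100.0 / base for a in adjusted]


adj_1945 = seasonally_adjusted(SALES_1945, SEASONAL)
adj_1946 = seasonally_adjusted(SALES_1946, SEASONAL)
idx_1945 = fixed_base_index(adj_1945, BASE_1935_39)
idx_1946 = fixed_base_index(adj_1946, BASE_1935_39)

for month, a45, a46, i45, i46 in zip(MONTHS, adj_1945, adj_1946, idx_1945, idx_1946):
    print(f"{month:<6}{a45:>8.0f}{a46:>8.0f}{i45:>7.0f}{i46:>7.0f}")
```

The printed figures should agree with Table 2 up to rounding. They also show the manager's difficulty: December and September, so different in actual volume, appear at nearly the same level once adjusted.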

If any statistician doubts that the average executive is confused by 
this type of presentation, he may carry out a simple experiment 
with any of the group at hand. Upon seeing such a report, the odds 
are very great that the executive will continue his old habit of compar-
ing a given month with the same month of the preceding year. This is
his own tried and true method of eliminating seasonal variation and 
he will not discard it lightly. He naturally arrives at the same result as 
with unadjusted data. The whole process of adjustment has been worse
than useless, adding confusion but no accuracy to the usual simple
procedure.

When he compares a given month with the same month of a preced- 
ing year, the executive realizes that his result is a mixture of unusual 
conditions that may have prevailed in both periods. While his primary 
concern is with the current period, this simple method gives him a 
compound result of the two. It is the statistician who should separate 
these influences for individual analysis, but this he has failed to do in 
understandable terms. 


TABLE 1. BASIC DATA

ECONOMY DEPARTMENT STORE SALES
In Thousands of Dollars

              Total Sales        Seasonal
Month       1945       1946      Index

Jan.         751        992        90
Feb.         853      1,055        95
Mar.         950      1,105       100
Apr.       1,107      1,154       105
May        1,203      1,213       110
June       1,209      1,177       105
July       1,054      1,010        90
Aug.         956        941        85
Sept.        853        876        80
Oct.         957      1,097       100
Nov.         981      1,203       110
Dec.       1,126      1,431       130
Year      12,000     13,254       100
TABLE 2. TRADITIONAL FORMS OF PRESENTATION

ECONOMY DEPARTMENT STORE SALES
Adjusted for Seasonal Variation

            Adjusted Sales
        In Thousands of Dollars     Index: 1935-39 = 100
Month       1945       1946           1945       1946

Jan.         834      1,102            139        184
Feb.         898      1,111            150        185
Mar.         950      1,105            158        184
Apr.       1,054      1,099            176        183
May        1,094      1,103            182        184
June       1,151      1,121            192        187
July       1,171      1,122            195        187
Aug.       1,125      1,107            187        184
Sept.      1,066      1,095            178        182
Oct.         957      1,097            159        183
Nov.         892      1,094            149        182
Dec.         866      1,101            144        183
Year       1,005      1,105            167        184








SUGGESTED FORM OF PRESENTATION 


One solution to the problem will be found in Table 3, using the same 
hypothetical data as before. Let us examine this presentation as it 
would be explained to the executive for whom it is designed. 

Column 1. Actual dollar sales for 1945. This offers no difficulty. 

Column 2. Dollar sales for 1945 as they would have been if distributed in 
the usual seasonal pattern. This is easily explained. 

Column 3. Actual dollar sales for 1946, which offer no difficulty. 

Column 4. Percentage change in actual dollar sales, the usual form of 
analysis in business. 

Column 5. Current year compared with preceding year, in percentage 
form, with unusual influences ironed out of the latter, and with seasonal in- 
fluence excluded as well. 

The final column is simply the seasonally adjusted index of sales,
with the preceding year as the base, in percentage rather than index
form. There is no difference except in manner of presentation. The
traditional form of presentation is mysterious. The form shown is 
simple, the mechanism is laid bare, and the meaning is clear. 

Our executive can readily understand, indeed he has a very keen 
appreciation of the fact, that some months in the preceding year 
were above or below that expected from the total volume of sales for 
the year. The suggested form of presentation shows the exact extent of 
that deviation in simple fashion. 
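
The five columns just described can be derived from the basic data in a few lines. The sketch below is illustrative only. It assumes that "normal" sales are the 1945 annual total spread over the months in proportion to the seasonal index, which is consistent with the description of Column 2 but is not spelled out as a formula in the text; `normal_sales` and `pct_change` are names introduced here.

```python
# Basic data from Table 1 (sales in thousands of dollars).
MONTHS = ["Jan.", "Feb.", "Mar.", "Apr.", "May", "June",
          "July", "Aug.", "Sept.", "Oct.", "Nov.", "Dec."]
SALES_1945 = [751, 853, 950, 1107, 1203, 1209, 1054, 956, 853, 957, 981, 1126]
SALES_1946 = [992, 1055, 1105, 1154, 1213, 1177, 1010, 941, 876, 1097, 1203, 1431]
SEASONAL = [90, 95, 100, 105, 110, 105, 90, 85, 80, 100, 110, 130]


def normal_sales(annual_total, seasonal):
    """Spread an annual total over the months in the usual seasonal pattern.

    ASSUMPTION: each month's share is its seasonal index divided by the
    sum of the twelve indexes.
    """
    weight = sum(seasonal)
    return [annual_total * idx / weight for idx in seasonal]


def pct_change(new, old):
    """Percentage change from old to new."""
    return (new / old - 1.0) * 100.0


normal_1945 = normal_sales(sum(SALES_1945), SEASONAL)

# Columns: month, (1) 1945 actual, (2) 1945 normal, (3) 1946 actual,
# (4) per cent change from 1945 actual, (5) per cent change from 1945 normal.
rows = [
    (m, a45, round(n45), a46,
     round(pct_change(a46, a45), 1), round(pct_change(a46, n45), 1))
    for m, a45, n45, a46 in zip(MONTHS, SALES_1945, normal_1945, SALES_1946)
]

for row in rows:
    print("{:<6}{:>7}{:>7}{:>7}{:>8}{:>8}".format(*row))
```

Column 5 stays near the annual gain in every month, while Column 4 swings from a large rise in January to a decline in midsummer; the difference between the two is the unusual pattern of 1945.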
TABLE 3. SUGGESTED FORM OF PRESENTATION

ECONOMY DEPARTMENT STORE SALES
In Thousands of Dollars

                 1945             1946      Per Cent Change
Month       Actual   Normal      Actual     Actual   Normal
             (1)      (2)         (3)        (4)      (5)

Jan.          751      900         992       32.1     10.2
Feb.          853      950       1,055       23.7     11.1
Mar.          950    1,000       1,105       16.3     10.5
Apr.        1,107    1,050       1,154        4.2      9.9
May         1,203    1,100       1,213        0.8     10.3
June        1,209    1,050       1,177       -2.6     12.1
July        1,054      900       1,010       -4.2     12.2
Aug.          956      850         941       -1.6     10.7
Sept.         853      800         876        2.7      9.5
Oct.          957    1,000       1,097       14.6      9.7
Nov.          981    1,100       1,203       22.6      9.4
Dec.        1,126    1,300       1,431       27.1     10.1
Year       12,000   12,000      13,254       10.5     10.5





The executive is dealing with familiar units in a familiar form. 
There are no abstract index numbers, no terms such as “adjusted for
seasonal variation” with a connotation of mystery.

Finally, he can continue his usual practice of comparing similar 
periods in different years. This is a process that he clearly under- 
stands and readily applies. But there is the vast difference that he can 
now see in quantitative terms exactly what is involved in this com- 
parison: unusual conditions in the preceding as well as in the current 
period. 


SUMMARY 


To present results in a way that will be readily understood and used, 
the business statistician must reorient his thinking in terms of the 
user. He must avoid abstract numbers and technical jargon. He must 
take full advantage of familiar forms of analysis in business, such as the 
comparison of similar periods in different years. 

Statisticians may well afford to give more thought to methods of 
presentation. What has been attempted here, with a relatively simple 
concept, should be extended to the more elaborate techniques. More 
lucid reporting is a real prerequisite to widespread adoption of sta- 
tistical research in business, a matter of interest to the entire statistical 
profession. 








SOME APPLICATIONS OF MULTIVARIATE ANALYSIS
TO ECONOMIC DATA¹


GERHARD TINTNER 
Iowa State College, Ames, Iowa 


THIS essay proposes to introduce the economic statistician to some
of the newer methods of multivariate analysis.² The emphasis will
be on methods of estimation and not on tests of hypotheses. Tests of
significance will be indicated where they have been established.

Estimation by means of multivariate analysis presents certain gen-
eralizations of the methods of multiple regression, as for instance pre-
sented in Ezekiel’s³ book. These analogies will be emphasized later in
the course of the discussion of the various procedures. We will deal 
only with the following methods: 

(1) Discriminant analysis: Here we propose to determine linear 
functions or “indexes” computed from various measurable characteris- 
tics of certain data. The data have been classified into two groups. 
Discriminant analysis tries to establish linear functions of the charac- 
teristics which are such that they distinguish most successfully be- 
tween these groups. This method was invented by R. A. Fisher. A test 
of significance utilizes earlier work of Harold Hotelling. 

(2) Principal components: We try to answer the following question: 
Is it possible to analyze a set of variables into a more fundamental set 
of components (“factors”) possibly fewer in number? Which portion 
of the total variance can be accounted for by each component? The 
best method in this field is due to Harold Hotelling. 

(3) Canonical correlation: Assume we have two sets of variables. 
How can we determine linear combinations (“indexes”) of the variables 
in each set in such a fashion that the correlation between the indexes 
becomes a maximum? This method is due to Harold Hotelling. 

(4) Weighted regression: Assume that we have a set of variables all 
of which are subject to disturbances (“errors”). How can we find a 
weighted linear regression function which will give us the “best” esti- 
mates of the weighted regression coefficients? This method, evidently 
closely related to classical multiple regression analysis, is in its present 
form due to Tjalling Koopmans. It can also be used to answer a ques-
tion previously raised by Ragnar Frisch: How many linear relation-
ships probably exist between the variables (multicollinearity)?

1 The author is greatly obliged to his colleagues, Prof. W. G. Cochran, J. Nordin, O. Brownlee
and A. M. Mood for help and criticism with this paper. He is also very much indebted to Prof. H. Ho-
telling (Columbia) and the following members of the Cowles Commission (Chicago): J. Marschak,
T. Koopmans, L. R. Klein and L. Hurwicz. Journal paper No. J-1373 of the Iowa Agricultural Experi-
ment Station, Ames, Iowa, Project No. 730.

2 A summary of some of the methods is given in: S. S. Wilks: Mathematical Statistics, Princeton,
1943, pp. 252 ff.

3 M. Ezekiel: Methods of Correlation Analysis, 2nd ed., New York, 1941.

In what follows we propose to discuss these methods briefly and with 
a uniform notation. We will try to avoid lengthy mathematical deduc- 
tions and presentation of numerical methods. These can easily be ob- 
tained in the literature which will be quoted below. No effort has been 
made to give a complete survey of the literature. 

Some examples previously given by other authors will be summarized 
and new examples will also be presented. These examples are supposed 
to indicate the wide range of problems to which the methods can be 
applied. It should be remembered that these examples are only tenta- 
tive applications of the various methods and should be regarded merely 
as illustrations. It is to be hoped that they will stimulate more exten- 
sive applications in the economic field. 

The data which we use in our examples are time series. But we have 
neglected almost entirely this particularity of the data and the difficul- 
ties connected with it. This introduces possibly some biases into the 
tests of significance because of the serial correlations⁵ probably existing
in the data. The problem of degrees of freedom in economic time series
has been treated by H. T. Davis.⁶ No use has been made of these and
similar methods. Presence of serial correlation makes the estimates 
inefficient. But the loss of efficiency is not very considerable if the serial 
correlation is not too large. It should be remembered, however, that 
the tests of significance, where they are given, may be influenced by ex- 
isting serial correlation in the variables. 

Another obvious shortcoming of the methods presented below is the 
fact that they all assume essentially linear relationships existing in the 
population corresponding to the sample. It is to be hoped that this dif- 
ficulty can be overcome later and that analogous methods will be de- 
veloped to deal with nonlinear cases. We may, for instance, use squares, 
cross products and higher powers etc. of the variables. 

1. Notation. Throughout this paper we will carry on our argument 
in terms of the sample, but always with the end in view to establish es- 
timates for the relationships existing in the population corresponding 
to the sample. 

4 See G. Tintner: “The Analysis of Economic Time Series,” Journal of the American Statistical
Association, Vol. 35, 1940, pp. 93 ff.

5 R. L. Anderson: “Distribution of the Serial Correlation Coefficient,” Annals of Mathematical
Statistics, Vol. 13, 1942, pp. 1 ff.

6 H. T. Davis: Analysis of Economic Time Series, Bloomington, Ind., 1941, pp. 175 ff.

Let $X_{it}$ ($i = 1, 2, \dots, p$; $t = 1, 2, \dots, N$) be a set of random variables.
The observations in this sample correspond to a normally distributed
multivariate population. We assume that each of the variables
$X_1, \dots, X_p$ has been observed at $N$ points $t = 1, 2, \dots, N$.
Denote by

$$\bar{X}_i = \sum_{t=1}^{N} X_{it}/N \qquad (i = 1, 2, \dots, p) \tag{1}$$

the sample means of the $p$ variables. Then:

$$x_{it} = X_{it} - \bar{X}_i \qquad (i = 1, 2, \dots, p;\ t = 1, 2, \dots, N) \tag{2}$$

are the deviations from the means. The sums of squares and products
are:

$$S_{ij} = \sum_{t=1}^{N} x_{it} x_{jt} \qquad (i, j = 1, 2, \dots, p); \tag{3}$$

and the sample variances and covariances:

$$a_{ij} = S_{ij}/(N - 1) \qquad (i, j = 1, 2, \dots, p). \tag{4}$$

The sample correlation coefficient between $X_i$ and $X_j$ is:

$$r_{ij} = a_{ij}/\sqrt{a_{ii} a_{jj}} \qquad (i, j = 1, 2, \dots, p). \tag{5}$$

Finally the standardized variables, deviations from the sample means
expressed in terms of their standard deviation $\sqrt{a_{ii}}$, are:

$$z_{it} = x_{it}/\sqrt{a_{ii}} \qquad (i = 1, 2, \dots, p;\ t = 1, 2, \dots, N). \tag{6}$$
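
The quantities defined in (1) to (6) can be computed directly. The following is a minimal sketch in plain Python; the function name `sample_moments` and the two short series are hypothetical and serve only to illustrate the notation.

```python
import math


def sample_moments(X):
    """Compute the quantities of eqs. (1)-(6).

    X is a list of p series, each holding N observations, so X[i][t]
    plays the role of X_it in the text.
    """
    p, N = len(X), len(X[0])
    means = [sum(series) / N for series in X]                       # eq. (1)
    x = [[v - means[i] for v in X[i]] for i in range(p)]            # eq. (2)
    S = [[sum(u * v for u, v in zip(x[i], x[j])) for j in range(p)]
         for i in range(p)]                                         # eq. (3)
    a = [[S[i][j] / (N - 1) for j in range(p)] for i in range(p)]   # eq. (4)
    r = [[a[i][j] / math.sqrt(a[i][i] * a[j][j]) for j in range(p)]
         for i in range(p)]                                         # eq. (5)
    z = [[v / math.sqrt(a[i][i]) for v in x[i]] for i in range(p)]  # eq. (6)
    return means, S, a, r, z


# Hypothetical data: p = 2 variables observed at N = 5 points.
X = [[1.0, 2.0, 3.0, 4.0, 5.0],
     [2.0, 1.0, 4.0, 3.0, 5.0]]

means, S, a, r, z = sample_moments(X)
print("means:", means)
print("sum of products S_12:", S[0][1])
print("correlation r_12:", round(r[0][1], 4))
```

Each standardized series has zero mean and a sample variance of one when the divisor $N - 1$ of (4) is used, which gives a quick check on the computation.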


2. Discriminant Analysis. The first method discussed here is the 
method of discriminant functions introduced into statistics by R. A. 
Fisher.⁷

The problem to be solved is the following: Assume we have a set of 
measurements of a number of variables which are classified into two 
groups. Which linear combination of the various measurements will 
best discriminate between the two groups? 

Assume that we have $N$ normally distributed observations on $p$ vari-
ables $X_i$ which we denote by $X_{it}$ ($i = 1, 2, \dots, p$; $t = 1, 2, \dots, N$).
Classify these into two groups for $t = 1, 2, \dots, N_1$ and $t = N_1 + 1, N_1 + 2,
\dots, N_1 + N_2 = N$. We define the means in each group:

$$\bar{X}_i^{*} = \sum_{t=1}^{N_1} X_{it}/N_1; \qquad \bar{X}_i^{**} = \sum_{t=N_1+1}^{N} X_{it}/N_2. \tag{7}$$

7 R. A. Fisher: “The Use of Multiple Measurements in Taxonomic Problems,” Annals of Eugenics,
Vol. 7, 1936, pp. 179 ff. See also: “The Statistical Utilization of Multiple Measurements,” ibid., Vol. 8,
1938, pp. 376 ff. Statistical Methods for Research Workers, 8th ed., London 1941, section 49.2, pp. 279 ff.

Let the differences of the means be:

$$d_i = \bar{X}_i^{**} - \bar{X}_i^{*} \qquad (i = 1, 2, \dots, p). \tag{8}$$

We want to find the linear function of the differences of the means:

$$Z = k_1 d_1 + k_2 d_2 + \dots + k_p d_p \tag{9}$$

which discriminates most successfully between the two sets of varia-
bles; this is to say whose square is a maximum relative to its variance.
Maximizing the square of (9) under the condition that its variance is
constant we get e.g. the following set of equations for the $k_i$:

$$\begin{aligned}
S_{11} k_1 + S_{12} k_2 + \dots + S_{1p} k_p &= d_1 \\
S_{12} k_1 + S_{22} k_2 + \dots + S_{2p} k_p &= d_2 \\
&\;\;\vdots \\
S_{1p} k_1 + S_{2p} k_2 + \dots + S_{pp} k_p &= d_p
\end{aligned} \tag{10}$$


The solutions $k_i$ are proportional to the estimates of the coefficients
of the linear function which in the population corresponding to the 
sample discriminates best between the two groups in the sense defined 
above. The similarity of the system of equations (10) and the normal 
equations in multiple regression analysis should be noted. 

A test of significance has been indicated by R. A. Fisher which
makes use of Hotelling’s generalized Student distribution.⁸ This dis-
tribution was already derived in 1931. Define a quantity analogous to
the multiple correlation coefficient:

$$R^2 = N_1 N_2 (k_1 d_1 + \dots + k_p d_p)/N. \tag{11}$$

Then the variance ratio

$$F = \frac{(N - p - 1) R^2}{p (1 - R^2)} \tag{12}$$

has Snedecor’s $F$ distribution for $n_1 = p$ and $n_2 = N - p - 1$ degrees of
freedom. In this fashion we can test the hypothesis that the empirical
discriminant function may have arisen out of pure chance, if in reality
there is no difference at all between the variates in the two groups in
the population. This test is also related to some work of the Indian
school of statistics.⁹


8 H. Hotelling: “The Generalization of Student’s Ratio,” Annals of Mathematical Statistics, Vol. 2,
1931, pp. 360 ff.

9 See, e.g., P. C. Mahalanobis: “On Generalized Distance in Statistics,” Proceedings of the National
Institute of Science of India, Vol. 12, 1936, pp. 49 ff. R. C. Bose and S. N. Roy: “The Distribution of
the Studentised D²-Statistic,” Sankhya, Vol. 4, 1938, pp. 19 ff.

Example 1


The method has been applied in a most interesting way by Mr. David
Durand to financial data. He utilized it e.g. to discriminate between
good and bad loans.¹⁰

Let $X_1$ be the down payment, $X_2$ the price, $X_3$ the monthly income
(all in dollars) and $X_4$ the length of the contract in months. Then a
linear function has been determined which in a sample of 484 good and
485 bad loans, discriminates best between good and bad loans:

$$Z = X_1 - 0.174 X_2 + 0.124 X_3 - 6.45 X_4. \tag{13}$$
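
To see how such a function would be used, (13) can be evaluated for individual contracts. The sketch below is only an illustration: the function name `durand_score` and the two contracts are hypothetical, and since the text gives no cut-off value of $Z$ separating good from bad loans, the scores are merely compared with one another.

```python
def durand_score(down_payment, price, monthly_income, contract_months):
    """Evaluate the discriminant function (13); money amounts in dollars."""
    return (down_payment
            - 0.174 * price
            + 0.124 * monthly_income
            - 6.45 * contract_months)


# Two hypothetical contracts that differ only in the down payment.
low_down = durand_score(100, 800, 300, 12)
high_down = durand_score(200, 800, 300, 12)

print(f"Z with $100 down: {low_down:.1f}")
print(f"Z with $200 down: {high_down:.1f}")
```

Since $X_1$ enters with a coefficient of one, each additional dollar of down payment raises $Z$ by one unit, while each additional month of contract length lowers it by 6.45 units.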


This method may also be applied in order to classify various eco- 
nomic phenomena. For instance, a group of prices called sensitive prices 
was frequently used in an attempt to anticipate more general price 
movements. The question whether a given price should be included in this
group could be decided by finding a set of relevant measurements for
each of a number of sensitive and non-sensitive commodities and then 
computing the linear combination of the measurements which dis- 
criminates most successfully between sensitive and non-sensitive prices. 
This discriminant function can then be used in order to classify a given 
price into one or the other of the two groups. 

A similar problem is the classification of prices into prices of con- 
sumers’ goods and producers’ goods which we propose to illustrate by 
an example. 

Example 2 


We have tried to apply the methods of discriminant analysis to the 
following problem: Is it possible to distinguish between the prices of 
producers’ goods and the prices of consumers’ goods on the basis of 
certain measurements connected with their behavior during the busi-
ness cycle? We are going to use some data collected for an earlier
book of the author.¹¹ We will use monthly English wholesale prices,
taken from the period 1860-1913. The seasonal variation and trend have been
eliminated from these series by a system of moving averages.

We denote by $X_1$ the median length of the cycle in months. This is
the median of all cycles in the period, measured from minimum to mini-
mum.

$X_2$ is the median percentage of the duration of cyclically rising prices
relative to the total duration of the cycle.

$X_3$ is the median cyclical amplitude expressed as percentage of the
trend.

$X_4$ is the mean monthly rate of change in the cycle (percentage of
trend per month).

10 D. Durand: Risk Elements in Consumer Installment Financing, Financial Research Program,
Studies in Consumer Installment Financing No. 8, National Bureau of Economic Research, New York,
1941, pp. 125 ff.

11 G. Tintner: Prices in the Trade Cycle, Vienna, 1935, Table 2, pp. 110 ff.

The fact that the various measurements are given in different units
is irrelevant, since the discriminant function is invariant for linear
transformations. Our result would not be affected if e.g. $X_1$ were given
in years instead of months.

We will try to construct a kind of “index” which will best discrimi- 
nate between consumers’ goods and producers’ goods on the basis of 
the measures of cyclical behavior indicated above. If we can do this, 
we would have a method which in a sense would measure most effi-
ciently the “cyclical distance” between prices of various commodities.¹²

The linear discriminant function will be in our case:

$$Z = k_1 X_1 + k_2 X_2 + k_3 X_3 + k_4 X_4. \tag{14}$$

This discriminant function should become a maximum while its
variance is constant.


TABLE 1
CYCLICAL MEASUREMENTS

Price                     X1        X2        X3       X4       Z

Consumers’ Goods
Rice                      72        50         8       0.5      0.186
Tea                       66.5      48        15       1.0      0.224
Sugar                     54        57        14       1.0      0.200
Flour                     67        60        15       0.9      0.228
Coffee                    44        57        14       0.3      0.183
Potatoes                  41        52        18       1.9      0.207
Butter                    34.5      50         4       0.5      0.098
Cheese                    34.5      46         8.5     1.0      0.128
Beef                      24        54         3       1.2      0.076
Average (X*)              48.611    52.667    11.056   0.922    0.170017

Producers’ Goods
Gasoline                  57        57        12.5     0.9      0.194
Lead                     100        54        17       0.5      0.293
Pig Iron                 100        32        16.5     0.7      0.283
Copper                    96.5      65        20.5     0.9      0.315
Zinc                      79        51        18       0.9      0.266
Tin                       78.5      53        18       1.2      0.266
Rubber                    48        50        21       1.6      0.238
Quicksilver              155        44        20.5     1.4      0.404
Copper Sheets             84        64        13       0.8      0.243
Iron Bars                105        35        17       1.8      0.298
Average (X**)             90.30     50.50     17.40    1.07     0.279983

General Average           70.553    51.526    14.395   1.000    0.227871
Difference (d)            41.689    -2.167     6.344   0.148    0.109921


12 H. Hotelling: “Spaces of Statistics and their Metrization,” Science, Vol. 47, 1928, pp. 149 ff.

We have chosen 19 prices. The various measurements for the data 
are indicated in Table 1. 

The matrix of the sums of squares and products computed from the 
data given above is presented in Table 2. 


TABLE 2
SUMS OF SQUARES AND PRODUCTS

          X1            X2            X3           X4

X1    18,382.456    -1,350.966     1,833.410      21.393
X2                   1,275.349       -45.623     -18.794
X3                                   495.146      16.345
X4                                                 3.460
Only the elements above the diagonal are given since the matrix is 
symmetrical. 

The system of equations to determine our estimates $k_i$ is taken from
Table 2 and the $d_i$ from the last line of the previous table:

$$\begin{aligned}
18{,}382.456\,k_1 - 1{,}350.966\,k_2 + 1{,}833.410\,k_3 + 21.393\,k_4 &= 41.689 \\
-1{,}350.966\,k_1 + 1{,}275.349\,k_2 - 45.623\,k_3 - 18.794\,k_4 &= -2.167 \\
1{,}833.410\,k_1 - 45.623\,k_2 + 495.146\,k_3 + 16.345\,k_4 &= 6.344 \\
21.393\,k_1 - 18.794\,k_2 + 16.345\,k_3 + 3.460\,k_4 &= 0.148.
\end{aligned} \tag{15}$$

The solutions are indicated in the following linear discriminant func-
tion:

$$Z = 0.001605 X_1 + 0.000277 X_2 + 0.006825 X_3 + 0.002115 X_4. \tag{16}$$
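
The step from the system of equations to the coefficients of (16) can be retraced numerically. The sketch below solves the four equations by Gaussian elimination with partial pivoting; this is a method chosen here for illustration, since the text does not say how the author carried out the computation, and `solve_linear_system` is a name introduced for the purpose.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        # Bring the largest remaining entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


# Sums of squares and products (Table 2, completed by symmetry) and the
# differences of the group means (last line of Table 1).
S = [
    [18382.456, -1350.966, 1833.410, 21.393],
    [-1350.966, 1275.349, -45.623, -18.794],
    [1833.410, -45.623, 495.146, 16.345],
    [21.393, -18.794, 16.345, 3.460],
]
d = [41.689, -2.167, 6.344, 0.148]

k = solve_linear_system(S, d)
for i, ki in enumerate(k, start=1):
    print(f"k{i} = {ki:.6f}")
```

The computed values should agree with the published coefficients to within rounding of the printed sums of squares.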


The meaning of the function (16) is as follows: If $Z$ is larger than its
value at the general mean (0.227871), a commodity should be classified as a
producers’ good, in the opposite case as a consumers’ good. The average 
Z for producers’ goods is 0.279983 and for consumers’ goods 0.170017. 
Only one consumers’ good (flour) and one producers’ good (gasoline) 
are misclassified. 
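
The classification rule can be applied mechanically to the measurements of Table 1. The following sketch uses the published coefficients of (16) and takes the cut-off to be the value of $Z$ at the general mean of the nineteen series; the dictionary and function names are introduced here for illustration.

```python
# Published coefficients of the discriminant function (16).
K = [0.001605, 0.000277, 0.006825, 0.002115]

# Cyclical measurements (X1, X2, X3, X4) transcribed from Table 1.
CONSUMERS = {
    "Rice": (72, 50, 8, 0.5),
    "Tea": (66.5, 48, 15, 1.0),
    "Sugar": (54, 57, 14, 1.0),
    "Flour": (67, 60, 15, 0.9),
    "Coffee": (44, 57, 14, 0.3),
    "Potatoes": (41, 52, 18, 1.9),
    "Butter": (34.5, 50, 4, 0.5),
    "Cheese": (34.5, 46, 8.5, 1.0),
    "Beef": (24, 54, 3, 1.2),
}
PRODUCERS = {
    "Gasoline": (57, 57, 12.5, 0.9),
    "Lead": (100, 54, 17, 0.5),
    "Pig Iron": (100, 32, 16.5, 0.7),
    "Copper": (96.5, 65, 20.5, 0.9),
    "Zinc": (79, 51, 18, 0.9),
    "Tin": (78.5, 53, 18, 1.2),
    "Rubber": (48, 50, 21, 1.6),
    "Quicksilver": (155, 44, 20.5, 1.4),
    "Copper Sheets": (84, 64, 13, 0.8),
    "Iron Bars": (105, 35, 17, 1.8),
}


def z_value(measurements):
    """Value of the discriminant function for one commodity."""
    return sum(k * x for k, x in zip(K, measurements))


# Cut-off: Z evaluated at the general mean of all nineteen series.
all_rows = list(CONSUMERS.values()) + list(PRODUCERS.values())
general_mean = [sum(column) / len(all_rows) for column in zip(*all_rows)]
cutoff = z_value(general_mean)

# A consumers' good above the cut-off, or a producers' good at or below
# it, falls on the wrong side.
misclassified = [name for name, x in CONSUMERS.items() if z_value(x) > cutoff]
misclassified += [name for name, x in PRODUCERS.items() if z_value(x) <= cutoff]

print(f"cut-off Z = {cutoff:.6f}")
print("misclassified:", misclassified)
```

Flour lies only slightly above the cut-off, so its classification is sensitive to small changes in the coefficients.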

The values of Z for various commodities and also for the averages 
are indicated in the last column of Table 1. It is interesting to note that 
in this function $Z$ (16) the largest weight has been given to $X_3$ (ampli-
tude). This seems to indicate that the cyclical amplitude is possibly
more important than other characteristics in distinguishing consumers’ 
and producers’ goods. 

We compute $R^2$ from the last line of Table 1 as $R^2 = (90)(0.109921)/19 = 0.520678$.
The following is the variance ratio: $F = 3.802$. This is
significant at the 5 per cent level, where 4 and 14 degrees of freedom
require an $F$ of only 3.11, though not at the 1 per cent level, where
an $F$ of 5.03 is required. The null hypothesis that our discriminant
function could have arisen by pure chance is therefore rejected at the
5 per cent level. (This test could also have been made by Hotelling’s
methods without first computing the discriminant function.) Hence it is
likely that some difference exists in the population in the cyclical
behavior of the two groups. We would conclude that there is an effective
linear combination of the cyclical measures indicated above which
distinguishes successfully between consumers’ and producers’ goods on
the basis of the data used. It is interesting to note that if another
consumers’ good, namely pepper, is included in the analysis we do not
achieve significant results.
Our result is possibly of some economic importance. It should be in- 
terpreted in the light of the obvious limitations of our methods in deal- 
ing with this problem: The characteristics indicated in Table 1 are 
probably not really normally distributed in spite of the fact that the 
median in large samples tends under certain conditions to be normally 
distributed. It is also possible that a non-linear combination of the 
characteristics would be more adequate in our case. 

If our results were more trustworthy and also based upon a larger 
sample covering a longer period we could draw more reliable conclu- 
sions. We may still tentatively say that our analysis seems to sup- 
port to a certain degree the contentions of the majority of business 
cycle theorists. 

3. Principal Components. The method presented here was first de-
vised by Hotelling¹³ to deal with a problem appearing in factor analysis
in psychology:¹⁴ How can we analyze a group of variables into a set of
independent, i.e. orthogonal, components, called “factors”?

Girshick¹⁵ has shown in an important article that the same method
can also be applied to the solution of other problems: We have a set 
of variates, each of which consists of the sum of a systematic component 
and an error. How can we find a linear function of the variates which 
is least subject to the “errors”? Girshick also showed that the principal 
components method leads to maximum likelihood estimates if the vari- 
ates are normally distributed. 


13 H. Hotelling: “Analysis of a Complex of Statistical Variables into Principal Components,” Jour-
nal of Educational Psychology, Vol. 24, 1933, pp. 417 ff., 498 ff. See also S. S. Wilks: Mathematical
Statistics, op. cit., pp. 252 ff.

14 K. J. Holzinger and H. H. Harman: Factor Analysis, Chicago, 1941.

15 M. A. Girshick: “Principal Components,” Journal of the American Statistical Association, Vol. 31,
1936, pp. 519 ff.
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Assume that we want to replace a set of standardized variables 
Z, ++ +2, by a more fundamental set of variables u,;--- uy. Let us 
define: 


Ryyts + hye +°++ + Kipp 


21 
(17) 
2, = kya oa k2Ue + eee +-. KppUp 


These $u_1, \dots, u_p$ are the principal components. We want the $u_i$
to reproduce the original correlations between the variables $z_j$.
The expression

$$S_i = k_{1i}^2 + k_{2i}^2 + \cdots + k_{pi}^2 \qquad (i = 1, 2, \dots, p) \tag{18}$$

is evidently the contribution of the $i$-th principal component $u_i$ to
the variances of all standardized variables $z_j$. Maximizing (18) under
the condition that the correlations between the original variables ought
to be reproduced leads to the system of linear equations for the
coefficients $k_{ji}$:

$$
\begin{aligned}
k_{1i} + r_{12}k_{2i} + \cdots + r_{1p}k_{pi} &= \lambda k_{1i} \\
&\;\;\vdots \\
r_{1p}k_{1i} + r_{2p}k_{2i} + \cdots + k_{pi} &= \lambda k_{pi}
\end{aligned}
\tag{19}
$$

This system of linear equations is again similar to the normal equa- 
tions in multiple regression analysis. It can only have a non-trivial 
solution if the determinant is equal to zero. 


$$
\begin{vmatrix}
1-\lambda & r_{12} & \cdots & r_{1p} \\
\vdots & \vdots & & \vdots \\
r_{1p} & r_{2p} & \cdots & 1-\lambda
\end{vmatrix} = 0. \tag{20}
$$


It can be shown that the largest root of (20) is associated with the first 
principal component which accounts for most of the variance. 
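
In matrix notation the same statements can be written compactly. This is a restatement added for clarity; the symbols $R$ (the correlation matrix with elements $r_{jl}$) and $k_i = (k_{1i}, \dots, k_{pi})'$ (the column of coefficients belonging to $u_i$) are shorthand assumed here and do not appear in the original. Equations (19) and (20) then read:

$$R\,k_i = \lambda_i k_i, \qquad \det(R - \lambda I) = 0, \qquad S_i = k_i' k_i = \lambda_i .$$

The last equality assumes that the $u_i$ are scaled to unit variance, which appears to be the normalization used in the examples below, where the squared coefficients add up to the root. Since the roots of (20) sum to the trace of $R$, that is to $p$, the share of the total variance carried by the $i$-th component is $\lambda_i / p$.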

Next assume that we have the variables $z_i$ consisting of two parts:
the “true” value or mathematical expectation and a random error. We
want to find a linear function $u = k_1 z_1 + k_2 z_2 + \cdots + k_p z_p$.
This ought to be chosen in such a fashion that the variance of the errors
is a minimum and that the variance of $u$ is one. Girshick¹⁶ has shown
this leads to the previous method. We must choose as $\lambda$ the largest
root of the determinantal equation (20). If we drop the second subscript
we get exactly the same solutions as before. This second interpretation
may be more useful than the original one in economic data.


16 Loc. cit., pp. 522 ff.

Girshick¹⁷ has also shown that the method of principal components
results from the maximum likelihood approach if the original variates
$X_i$ follow a multivariate normal distribution. Hence it provides
estimates of the principal components in the population corresponding
to the sample which have certain optimum properties associated with
maximum likelihood solutions. The results of this method need not,
however, be meaningful in economic terms.

The distribution of the latent roots of the determinantal equation
(20) has been established by various authors.¹⁸

M. J. Hagood and E. H. Bernert¹⁹ have recently used the method of
principal components in the field of sampling of economic data.

The most important class of problems to which it could be applied
is perhaps that connected with statistical questions arising from the
transition from micro-economic to macro-economic analysis.²⁰ These
questions have been discussed from the point of view of economic
theory,²¹ but never verified statistically by the use of valid methods.

The practical importance of a solution of this problem lies in the 
following: Many questions of economic policy require a knowledge of 
the broad economic relationships which are discussed in economic 
theory under the name of general equilibrium. This is true for instance 
of problems of full employment, taxation, subsidies, etc., which ought 
to be discussed in the most general terms possible. It is obviously im- 
possible to verify statistically a true general equilibrium because of the 
great number of variables involved. It would be necessary to include 
all prices of all commodities, all quantities of all commodities produced 
and consumed, all interest rates, etc. It is obvious that such a procedure
would literally involve thousands and possibly millions of variables.

Hence it seems to be necessary to substitute certain indexes for
groups of these variables. We may want, for instance, to represent all
wholesale prices by an index of wholesale prices, all quantities
produced by an index of production, etc. Which particular indexes will
be chosen depends of course upon the nature of the economic problem
considered. But it is of interest to establish the statistical validity of
these indexes in the following sense: How perfect is the representation
of all the various prices, for instance, by some general price index?
What percentage of the variance of various quantities produced in
the economy is accounted for by a certain production index? We
believe that questions of this nature can be answered tentatively by the
method of principal components.

17 Loc. cit., pp. 527 ff.

18 M. A. Girshick: “On the Sampling Theory of the Roots of Determinantal Equations,” Annals of
Mathematical Statistics, Vol. 10, 1939, pp. 203 ff. R. A. Fisher: “The Sampling Distribution of Some
Statistics Obtained from Non-Linear Equations,” Annals of Eugenics, Vol. 9, 1939, pp. 238 ff. P. L.
Hsu: “On the Distribution of Roots of Certain Determinantal Equations,” Ibid., pp. 250 ff. S. S. Wilks:
Op. cit., pp. 261 ff. (unpublished results by A. M. Mood).

19 M. J. Hagood and E. H. Bernert: “Component Indexes as a Basis for Stratifying a Sample,”
Journal of the American Statistical Association, Vol. 40, 1945, pp. 330 ff.

20 R. Frisch: “Propagation Problems and Impulse Problems in Dynamic Economics,” Economic
Essays in Honor of Gustav Cassel, London, 1933, pp. 171 ff. M. Kalecki: “A Macrodynamic Theory of
Business Cycles,” Econometrica, Vol. 3, 1935, pp. 327 ff.

21 See, e.g., O. Lange: Price Flexibility and Employment, Bloomington, Indiana, 1944, pp. 103 ff.

22 G. Tintner: “Multiple Regression for Systems of Equations,” Econometrica, Vol. 14, 1946, pp.
5 ff., esp. pp. 6-9.

These problems are evidently connected with the general problem
of index numbers.²³ They have very far-reaching significance for the
choice between several possible macro-economic models and their
empirical validity, if we strive for econometric applications of these
models.

The very efficient computational methods developed by Hotelling²⁴
have been utilized in the following examples:


Example 3 


The first example deals with an attempt to determine the principal
components of a set of production indexes. This example is somewhat
related to an earlier essay of E. C. Rhodes.²⁵

Denote by $X_1$ an index for the production of manufactured durable
goods, $X_2$ of non-durable manufactured goods, $X_3$ of minerals and $X_4$
of agricultural products. All indexes are computed with the base
1935-39 = 100. The period covered is 1919-39. We use annual figures.
The indexes $X_1$, $X_2$, $X_3$ are taken from the publications of the Federal
Reserve Board and $X_4$ from the yearbook of agricultural statistics of
the Department of Agriculture. The correlation matrix is given in the
following table:

TABLE 3
CORRELATION MATRIX

        X1         X2         X3         X4
X1   1.000000   0.495941   0.872836   0.481240
X2              1.000000   0.768279   0.709807
X3                         1.000000   0.712358
X4                                    1.000000








These four variables can be analyzed into various components or 
factors. 


23 See, e.g., G. von Haberler: Der Sinn der Indexzahlen, Tuebingen, 1927. R. Frisch: “Annual Survey
of Economic Theory: The Problem of Index Numbers,” Econometrica, Vol. 4, 1936, pp. 1 ff.

24 H. Hotelling: “Simplified Calculation of Principal Components,” Psychometrika, Vol. 1, 1936,
pp. 27 ff.

25 E. C. Rhodes: “The Construction of an Index of Business Activity,” Journal of the Royal
Statistical Society, Vol. 100, 1937, pp. 18 ff.

We want to find the principal component which is such that it
contributes most to the variances of the standardized variables
$z_1, \dots, z_4$. The system of linear equations which yields the
coefficients of the first and largest principal component is:

$$
\begin{aligned}
1.000000\,k_{11} + 0.495941\,k_{21} + 0.872836\,k_{31} + 0.481240\,k_{41} &= \lambda k_{11} \\
0.495941\,k_{11} + 1.000000\,k_{21} + 0.768279\,k_{31} + 0.709807\,k_{41} &= \lambda k_{21} \\
0.872836\,k_{11} + 0.768279\,k_{21} + 1.000000\,k_{31} + 0.712358\,k_{41} &= \lambda k_{31} \\
0.481240\,k_{11} + 0.709807\,k_{21} + 0.712358\,k_{31} + 1.000000\,k_{41} &= \lambda k_{41}
\end{aligned}
\tag{21}
$$

This system of equations can only have non-trivial solutions if its
determinant becomes zero. The determinantal equation becomes:

$$
\begin{vmatrix}
1.000000-\lambda & 0.495941 & 0.872836 & 0.481240 \\
0.495941 & 1.000000-\lambda & 0.768279 & 0.709807 \\
0.872836 & 0.768279 & 1.000000-\lambda & 0.712358 \\
0.481240 & 0.709807 & 0.712358 & 1.000000-\lambda
\end{vmatrix} = 0. \tag{22}
$$

From this we have $\lambda = 3.033424$. The contributions of the first
principal component to the variances of the standardized variables are
the squares of: $k_{11} = 0.817391$, $k_{21} = 0.888102$, $k_{31} = 0.951934$,
$k_{41} = 0.818776$. Hence it follows that the first principal component
“explains” about 67 per cent of the variance of $z_1$, about 79 per cent
of the variance of $z_2$, about 91 per cent of the variance of $z_3$ and
about 67 per cent of the variance of $z_4$.
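
The largest root can be approximated with a few lines of code. The following is a minimal sketch, assuming the correlations of Table 3 as printed (with $r_{34} = 0.712358$, the value used in the determinantal equation); the function name `first_principal_component` is hypothetical, and plain power iteration is used as a stand-in for an iterative scheme of this kind, not as a reproduction of Hotelling's computational layout.

```python
import math

# Correlation matrix of Table 3 (X1 durable goods, X2 non-durable goods,
# X3 minerals, X4 agricultural products), typed in from the text.
R = [
    [1.000000, 0.495941, 0.872836, 0.481240],
    [0.495941, 1.000000, 0.768279, 0.709807],
    [0.872836, 0.768279, 1.000000, 0.712358],
    [0.481240, 0.709807, 0.712358, 1.000000],
]


def first_principal_component(R, iterations=500):
    """Return (largest root, loadings) of a correlation matrix.

    Power iteration: multiply a trial vector by R repeatedly and rescale it
    to unit length; the scaling factor converges to the largest root.
    """
    p = len(R)
    v = [1.0] * p
    lam = 0.0
    for _ in range(iterations):
        w = [sum(R[i][j] * v[j] for j in range(p)) for i in range(p)]
        lam = math.sqrt(sum(x * x for x in w))
        v = [x / lam for x in w]
    # Scale the unit vector so that the squared loadings add up to the root.
    loadings = [math.sqrt(lam) * x for x in v]
    return lam, loadings


lam, loadings = first_principal_component(R)
print("largest root:", round(lam, 6))
print("share of total variance:", round(lam / len(R), 4))
for name, k in zip(("z1", "z2", "z3", "z4"), loadings):
    print(name, "loading", round(k, 6), "share", round(k * k, 4))
```

Because the matrix is typed in from a printed table, the computed loadings need not agree with the published coefficients in every decimal; the root itself is less sensitive to such differences than the individual loadings.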

The same results can also be used to exemplify the other approach.
We assume now that each variable $z_i$ consists of a systematic part,
the mathematical expectation, and a random component. The function
$u$ which minimizes the error variances (while its own variance is
one) is:

$$u = 0.269462\,z_1 + 0.292772\,z_2 + 0.313815\,z_3 + 0.269918\,z_4. \tag{23}$$

The coefficients in (23) are proportional to the previous ones. It is
interesting to note that minerals have the greatest weight.
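
The factor of proportionality can be made explicit. Assuming, as the figures suggest, that the coefficients $k_{j1}$ satisfy $R k_1 = \lambda k_1$ with $k_1' k_1 = \lambda$ (the matrix symbols $R$ and $k_1$ are shorthand introduced here, not the author's), dividing each $k_{j1}$ by $\lambda$ gives a function of unit variance:

$$u = \frac{1}{\lambda}\sum_{j=1}^{4} k_{j1} z_j, \qquad \operatorname{Var}(u) = \frac{k_1' R\, k_1}{\lambda^2} = \frac{\lambda\, k_1' k_1}{\lambda^2} = 1 .$$

For instance, $0.888102 / 3.033424 \approx 0.292772$, which is the coefficient of the second variable in (23).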

The total variance of the four standardized variables is evidently 4.
Hence, since $\lambda = 3.033424$, it appears that the first principal
component “explains” about 76 per cent of the total variance of the
standardized variates $z_1, \dots, z_4$.

The economic interpretation of these results would appear to be as
follows: There existed in all probability during the period considered
in the American economy a phenomenon like “production in general.”
This general “factor” would account for more than three-quarters of the
total variance of the individual production indexes. A more detailed
analysis should of course be carried out to confirm this result not only
with respect to the broad categories of production used here but also
with respect to the production of individual commodities.

This result, while not unexpected, is by no means trivial. It would be 
possible, for instance, to imagine an economy where the industrial 
sector and the agricultural sector have very little relationship. Then 
we would have two important factors, say, industrial production and 
agricultural production. The first named would probably account for
most of the variance in $X_1$, $X_2$, and $X_3$ and the second for most of
the variance of $X_4$. This is obviously not the case in our example. The
factor “production in general” which we have indicated in (23)
accounts for most of the variance of $X_4$ as well as for most of the
variance of all other variables.

Example 4 


A second example, which will not be presented in such great detail,
deals with prices. Denote by $X_5$ an index of wholesale farm prices, by
$X_6$ an index of wholesale food prices, by $X_7$ an index of all other
wholesale prices. These are taken from the Bureau of Labor Statistics
indexes for the period 1919-1939. The base year of the indexes is 1926.
The indexes are given annually. We again want to find the first principal
component which accounts for most of the variances of the standardized
variates.

An analysis of the data reveals that the contribution of the first
principal component to the variance of each standardized variable
$z_5$, $z_6$, $z_7$ is the square of the corresponding coefficient:
$k_{51} = 0.986867$, $k_{61} = 0.990160$ and $k_{71} = 0.957621$. It appears
that the first principal component accounts for about 97 per cent of the
variance of $z_5$, for about 98 per cent of the variance of $z_6$ and about
92 per cent of the variance of $z_7$.

The function u which minimizes the variance of the random errors 
(while its own variance is one) is: 


$$u = 0.343693\,z_5 + 0.344845\,z_6 + 0.333508\,z_7. \tag{24}$$


The coefficients in (24) are again proportional to the k’s indicated 
above. 

It is remarkable to note that here the weights given to the various
variables are approximately the same. The greatest root of the
determinantal equation (20) is here $\lambda = 2.871360$. Since the total
variance of the standardized variables is 3, the principal component can
be said to account for more than 95 per cent of the total variance.

It is of some interest to correlate our “index” (24) with the All
Commodities Wholesale Price Index computed by the Bureau of Labor
Statistics. The resulting correlation coefficient is 0.991 and highly
significant for 19 degrees of freedom.
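
The figures quoted for Example 4 can be checked against one another with simple arithmetic. The sketch below uses only numbers stated in the text; the variable names are illustrative, and the relation "weight = coefficient divided by the largest root" is the proportionality the text mentions, written out under the assumption that the squared coefficients sum to the root.

```python
# Figures quoted in Example 4 of the text.
loadings = {"z5": 0.986867, "z6": 0.990160, "z7": 0.957621}
largest_root = 2.871360

# Share of each variable's variance explained by the first component.
shares = {name: k * k for name, k in loadings.items()}

# Weights of the unit-variance function u, proportional to the loadings.
weights = {name: k / largest_root for name, k in loadings.items()}

# Share of the total variance (3 standardized variables).
total_share = largest_root / len(loadings)

for name in loadings:
    print(name, "share", round(shares[name], 4), "weight", round(weights[name], 6))
print("sum of squared loadings:", round(sum(shares.values()), 6))
print("share of total variance:", round(total_share, 4))
```

The squared coefficients add up to the root, and the weights agree with (24) to within a few units in the sixth decimal place.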

Hence we would conclude that on the basis of the evidence presented 
it seems that a “general” price index (24) could very well explain most 
of the variability of the price indexes of groups of commodities. It ap- 
pears that the residual variability is really almost negligible. This result 
should, however, be checked by an analysis of the prices of individual 
goods rather than of broad price categories like those used in our own 
procedure. 

Again this result is what we should expect. But it is by no means as
obvious as it seems. We could again, for instance, imagine an economy
in which the industrial and the agricultural sectors have very little
connection. Then we would have to distinguish two factors instead of
one, say industrial prices and agricultural prices. The first factor would
account for most of the variance of $X_7$ and the second for most of the
variances of $X_5$ and $X_6$. This is obviously not the case in the American
economy. The index indicated in (24), which represents general price
movements, never accounts for less than 90 per cent of the variance of
any of our variables.

4. Canonical Correlations. In economic statistics we desire sometimes
to find the relationship between sets of variables. The method of
canonical correlations, introduced into statistics by Hotelling,²⁶ provides
means for accomplishing this. We replace each of the two sets of
variates by a linear combination of the variates contained in each set, the
canonical variate. Then we endeavor to maximize the correlation
between these two canonical variates, the canonical correlation.

Assume we have $p$ variables $X_i$ and $N$ observations on each variable.
The variables are divided into two groups: $i = 1, 2, \dots, p'$ and
$i = p'+1, p'+2, \dots, p$.

We want to find two linear functions:

$$U = k_1 X_1 + k_2 X_2 + \cdots + k_{p'} X_{p'} \tag{25}$$

and

$$V = k_{p'+1} X_{p'+1} + k_{p'+2} X_{p'+2} + \cdots + k_p X_p \tag{26}$$

which have maximum correlation with each other. The variances of
$U$ and $V$ are supposed to be equal to one. The canonical correlation
coefficient between $U$ and $V$ becomes:

$$R = \sum_{i=1}^{p'} \sum_{j=p'+1}^{p} a_{ij} k_i k_j \tag{27}$$

and this is to be made a maximum under the conditions that the two
variances are equal to one.

26 H. Hotelling: “Relations between Two Sets of Variables,” Biometrika, Vol. 28, 1936, pp. 321 ff.
See also: S. S. Wilks: Mathematical Statistics, op. cit., pp. 257 ff.

Maximizing (27) under the conditions that the variances of $U$ and $V$
are one, we are led to a system of linear equations. These linear
equations again show a certain similarity to the normal equations in
classical multiple regression analysis. With the help of various
transformations which have been given explicitly by Hotelling and need
not be repeated here, we are finally led to a determinantal equation:

$$
\begin{vmatrix}
f_{11}-\lambda^2 & f_{12} & \cdots & f_{1p'} \\
f_{21} & f_{22}-\lambda^2 & \cdots & f_{2p'} \\
\vdots & \vdots & & \vdots \\
f_{p'1} & f_{p'2} & \cdots & f_{p'p'}-\lambda^2
\end{vmatrix} = 0. \tag{28}
$$

The $f_{ij}$ are certain functions of the variances and co-variances $a_{ij}$.
Equation (28) determines $\lambda^2$, the square of the maximum canonical
correlation coefficient. We take the largest root of the determinantal
equation (28). The joint distribution of the roots of this equation has
been found by various authors,²⁷ under the hypothesis that their
population value is zero. Standard errors have been derived earlier by
Hotelling. Inserting the value of $\lambda$ into the system of linear equations
we find $k_1, \dots, k_p$. These provide estimates of the canonical variates in
the population corresponding to our sample. These canonical variates
are such that $U$ is most successful in predicting $V$ and $V$ is the best
predictor of $U$.

It should perhaps be emphasized that these methods do not neces- 
sarily yield results which can be readily interpreted in terms of eco- 
nomic theory. This problem will be discussed in greater detail in the 
last section of this paper. 

Hotelling²⁸ applied canonical correlation first to some psychological
data taken from T. L. Kelley. But he indicated the possibility of applying
this method to certain economic problems, e.g. the effect of crops
of agricultural products on their prices, etc.²⁹

The two most successful attempts to apply these methods to economic
data have been made by F. V. Waugh.³⁰ He studied (1) the relation
between consumption and prices of various types of meat and (2)
the relation between characteristics of wheat and characteristics of
flour.

27 R. A. Fisher: “The Sampling Distribution of Some Statistics Obtained from Non-Linear
Equations,” Annals of Eugenics, Vol. 9, 1939, pp. 238 ff. P. L. Hsu: “On the Distribution of Roots of
Certain Determinantal Equations,” ibid., pp. 250 ff. S. S. Wilks, op. cit., pp. 265 ff.

28 Loc. cit., pp. 342 ff.

29 Loc. cit., p. 322, p. 376 f.

30 F. V. Waugh: “Regressions Between Sets of Variables,” Econometrica, Vol. 10, 1942, pp. 290 ff.

Example 5

We indicate the first analysis as follows: Let $X_1$ be steer prices and
$X_2$ hog prices, $X_3$ beef consumption and $X_4$ pork consumption. Then
the two canonical variates are: $U = 1.71117\,X_1 + 1.54037\,X_2$ for the
prices and $V = 5.25679\,X_3 + 15.45684\,X_4$ for the consumption. These
canonical variates are chosen in such a fashion as to maximize the
(canonical) correlation between $U$ and $V$. This correlation turns out
to be −0.84666. $U$ is the most successful linear combination of the
prices to predict $V$ and $V$ is the best linear combination of the
consumption data for predicting $U$.

Example 6 

In the other example given by Waugh, let the wheat characteristics
be as follows: $X_1$ kernel texture, $X_2$ test weight, $X_3$ damaged kernels,
$X_4$ foreign materials, $X_5$ crude protein content. The flour
characteristics are as follows: $X_6$ wheat per bbl. of flour, $X_7$ ash in
flour, $X_8$ crude protein in flour, $X_9$ gluten quality index. The canonical
variate formed from the wheat characteristics is:
$U = 0.03902\,X_1 + 0.23817\,X_2 - 0.03172\,X_3 - 1.18545\,X_4 + 0.77554\,X_5$.
The canonical variate formed from the flour characteristics is as follows:
$V = -0.11971\,X_6 - 13.12015\,X_7 + 1.12464\,X_8 + 0.05903\,X_9$.
The (canonical) correlation between $U$ and $V$ is 0.909388. This is the
highest possible correlation between any linear combination of wheat
and flour characteristics. $U$ may be used to predict $V$ and $V$ is most
successful in predicting $U$.


Example 7 


In the following example we will try to determine the relationship 
between certain price indexes and some production indexes by the 
method of canonical correlation. The data are the following: 

$X_1$ is the index of production of manufactured durable goods, $X_2$ of
nondurable goods, $X_3$ is the production index of minerals and $X_4$ the
index for agricultural products. All these indexes are given annually for
the base 1935-1939 = 100. They have been taken from the publications
of the Federal Reserve Board except for $X_4$, which comes from the
Department of Agriculture. These production indexes form the first group.

The yearly price indexes, all given for the base 1926 = 100, are taken
from the publications of the Bureau of Labor Statistics. All are
wholesale prices. $X_5$ denotes farm prices, $X_6$ food prices and $X_7$ other
prices. The period covered by all these indexes is 1919-1939. They are
annual data given for 21 years.

The matrix of the correlation coefficients is represented in the
following table:

TABLE 4
CORRELATION MATRIX

        X1         X2         X3         X4         X5         X6         X7
X1   1.000000   0.495941   0.872836   0.481240  -0.436385  -0.427250  -0.203390
X2              1.000000   0.768279   0.709807   0.425728   0.429576   0.584220
X3                         1.000000   0.712358  -0.038273  -0.043762   0.138680
X4                                    1.000000   0.261010   0.267098   0.378452
X5                                               1.000000   0.987285   0.904598
X6                                                          1.000000   0.914394
X7                                                                     1.000000





We want to find two linear functions (canonical variates):

$$U = k_1 z_1 + k_2 z_2 + k_3 z_3 + k_4 z_4 \tag{29}$$

$$V = k_5 z_5 + k_6 z_6 + k_7 z_7. \tag{30}$$

The variances of these two functions $U$ and $V$ should be one and their
correlation a maximum.


Using the iteration methods developed by Hotelling³¹ and the
computation schemes of Waugh³² we get the following results:

$$U = 1.094989\,z_1 - 0.371620\,z_2 - 0.587650\,z_3 - 0.020964\,z_4 \tag{31}$$

$$V = 1.000000\,z_5 - 0.011424\,z_6 - 0.215485\,z_7. \tag{32}$$

These results are given in terms of the standardized variables $z_i$.

The (canonical) correlation coefficient between these two linear func- 
tions of the quantities produced (U) and the prices (V) is 0.8831. U and 
V are chosen in such a manner that they have the highest possible cor- 
relation with each other. 

This can also be expressed in the following way: Our U is the linear 
combination of the various production indexes which is most successful 
in predicting the general price “index” V. And at the same time V is the 
linear combination of price indexes which is best in order to predict 
the general production “index” U. 
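
The computation behind Example 7 can be sketched as follows. This is an illustration under stated assumptions, not Hotelling's or Waugh's computing scheme: it takes the correlations of Table 4 as printed, forms the matrix $R_{22}^{-1} R_{21} R_{11}^{-1} R_{12}$ (a standard way of writing the problem, with the partition symbols and all function names introduced here), and finds its largest root by power iteration. The square root of that root is the first canonical correlation.

```python
import math

# Upper triangle of Table 4 (production X1..X4, prices X5..X7), as printed.
UPPER = [
    [1.000000, 0.495941, 0.872836, 0.481240, -0.436385, -0.427250, -0.203390],
    [1.000000, 0.768279, 0.709807, 0.425728, 0.429576, 0.584220],
    [1.000000, 0.712358, -0.038273, -0.043762, 0.138680],
    [1.000000, 0.261010, 0.267098, 0.378452],
    [1.000000, 0.987285, 0.904598],
    [1.000000, 0.914394],
    [1.000000],
]


def full_matrix(upper):
    """Expand an upper triangle (rows starting on the diagonal) symmetrically."""
    n = len(upper)
    R = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(upper):
        for offset, value in enumerate(row):
            R[i][i + offset] = R[i + offset][i] = value
    return R


def solve(A, B):
    """Solve A X = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [A[i][:] + B[i][:] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        piv = M[col][col]
        M[col] = [x / piv for x in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def first_canonical_correlation(R, p1, iterations=2000):
    """Largest canonical correlation between variables 0..p1-1 and p1..p-1."""
    R11 = [row[:p1] for row in R[:p1]]
    R12 = [row[p1:] for row in R[:p1]]
    R21 = [row[:p1] for row in R[p1:]]
    R22 = [row[p1:] for row in R[p1:]]
    # Largest root of R22^-1 R21 R11^-1 R12 is the squared correlation.
    M = solve(R22, matmul(R21, solve(R11, R12)))
    b = [[1.0] for _ in range(len(R) - p1)]
    root = 0.0
    for _ in range(iterations):
        w = matmul(M, b)
        root = math.sqrt(sum(x[0] ** 2 for x in w))
        b = [[x[0] / root] for x in w]
    return math.sqrt(root), [x[0] for x in b]


rho, price_weights = first_canonical_correlation(full_matrix(UPPER), 4)
print("first canonical correlation:", round(rho, 4))
print("price weights (unit length):", [round(x, 6) for x in price_weights])
```

With Table 4 as transcribed here, the published pair (31) and (32) satisfies these equations with a squared correlation of about 0.78, so the printed result should be close to the 0.8831 reported in the text, provided the iteration settles on the same root. The weights are returned only up to scale and sign.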

Needless to say, our results are to be interpreted with a certain
amount of caution. After all, we are dealing here only with production
and price indexes for rather broad categories and not with the
production and price data for individual commodities. We can nevertheless
reach some tentative conclusions. Our method does not imply that
we necessarily get economically meaningful (“structural”)
relationships.

The first index $U$ (31) shows that in trying to estimate the mutual
interdependence between production and prices the largest weight
has to be given to the production of durable goods. This agrees with
the ideas of many students of the business cycle. The weight given to
production of minerals is also quite important but negative.
Agricultural products seem to play only a very insignificant part. It is
especially the weighted difference between the movements of the
production of durable goods and the production of minerals which appears
to be decisive. This points in the direction of certain business cycle
theories, especially those stressing the different behavior of various
producers’ and consumers’ goods in the cycle.

31 Op. cit., pp. 342 ff.

32 Op. cit., pp. 301 ff.

Highest weight in $V$ (32) is given to farm prices and some weight
also to the index representing all other prices. But this weight is
negative. It is probably due to the fact that many miscellaneous prices
are contained in this category that the influence is here considerably
smaller. Food prices seem not to play a very important part in the
determination of $V$. It is interesting to note that it is the difference
between farm prices and prices of other commodities which appears to
be decisive. This again seems to point in the direction of certain
business cycle theories.

5. Weighted Regression. Ordinary multiple regression tries to
estimate the relationship between a “dependent” variable and a set of
“independent” variables in such a way as to make the prediction of the
dependent variable most successful. The sum of squares of the deviations
from a linear combination of the fixed values of the independent
variables becomes as small as possible (method of least squares). This
assumes that we want to predict the dependent variable most
successfully for fixed values of the “predictors,” i.e. the independent
variables.³³

This method evidently breaks down if we are not interested in
prediction but only in the establishment of “structural” relationships
existing in the population, and if we also assume that all variables are
subject to disturbances. For theoretical purposes and also for purposes of
economic policy this is most important, as Haavelmo has shown.³⁴

Assume we adopt the following stochastic scheme: Not only the
dependent variable but all variables in the system contemplated are
subject to error (Frisch).³⁵ We do not want to predict one of the variables
for fixed values of the others, but want to estimate the structural
relationships themselves, i.e. the regression coefficients of the weighted
regression equation.

33 H. Hotelling: “The Selection of Variates for Use in Prediction with Some Comments on the
General Problem of Nuisance Parameters,” Annals of Mathematical Statistics, Vol. 11, 1940, pp. 271 ff.

34 T. Haavelmo: “The Statistical Implications of a System of Simultaneous Equations,”
Econometrica, Vol. 11, 1943, pp. 1 ff. “The Probability Approach in Econometrics,” ibid., Vol. 12, 1944,
Supplement.

35 R. Frisch: Statistical Confluence Analysis by Means of Complete Regression Systems, Oslo, 1934.

The estimation of structural relationships³⁶ is most important in
connection with problems arising in economic policy. This will be
illustrated later, especially with the help of Examples 10 and 11. We will,
however, indicate here some of the outstanding features of this idea
with the help of another illustration.

Consider the market for a commodity, say, wheat. It is known from 
economic theory that the price of wheat and the quantity of wheat 
sold on the market are determined by the demand function and the sup- 
ply function of wheat. If we correlate quantity and price the result need 
not necessarily represent either the demand function or the supply 
function. This is irrelevant, however, as long as we assume that the 
fundamental conditions (tastes, technology, etc.) remain the same and 
we want only to make predictions. The classical regression of the price 
on the quantity will for instance under these conditions give the best 
prediction for the price. And the classical regression of the quantity on 
the price will be the most successful predictor for the quantity of wheat 
sold and bought on the market as long as there is no change in the fun- 
damental underlying conditions. 

But the situation is entirely different if, for instance, the
government decides to fix the price of wheat. Then it becomes most
important to know the elasticity of demand. But this elasticity cannot be
established from either of the two classical regression equations, except
in very exceptional cases.³⁷ Hence we need a method which will yield
estimates of the important economic structural coefficients, e.g.
elasticities, themselves. Weighted regression is designed for this
particular purpose rather than for the prediction of values of one
particular variable, as classical multiple regression is.
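
The point about the two classical regressions can be illustrated with a small simulation. This is a sketch under entirely hypothetical assumptions (a linear demand curve with slope −1.0, a linear supply curve with slope 0.5, independent normal shifts of both curves, and made-up function names); it is not taken from the wheat market or from any data in this paper.

```python
import random


def simulate_market(n=5000, seed=0):
    """Market-clearing prices and quantities for a hypothetical commodity.

    Demand:  q = 10 + demand_slope * p + e_d
    Supply:  q =  2 + supply_slope * p + e_s
    Both curves shift randomly from period to period.
    """
    rng = random.Random(seed)
    demand_slope, supply_slope = -1.0, 0.5
    prices, quantities = [], []
    for _ in range(n):
        e_d = rng.gauss(0.0, 1.0)
        e_s = rng.gauss(0.0, 1.0)
        # Equate demand and supply and solve for the price.
        p = (10.0 - 2.0 + e_d - e_s) / (supply_slope - demand_slope)
        q = 2.0 + supply_slope * p + e_s
        prices.append(p)
        quantities.append(q)
    return prices, quantities, demand_slope, supply_slope


def regression_slope(x, y):
    """Least-squares slope of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


prices, quantities, demand_slope, supply_slope = simulate_market()
slope = regression_slope(prices, quantities)
print("demand slope:", demand_slope, "supply slope:", supply_slope)
print("regression of quantity on price:", round(slope, 3))
```

When both curves shift, the fitted slope falls between the demand slope and the supply slope and equals neither, which is why such a regression may serve for prediction under unchanged conditions but not for reading off an elasticity.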

The problem of distinguishing the various economically meaningful 
relationships, e.g., demand functions and supply functions, is also very 
important. This problem of identification will be discussed in more de- 
tail in connection with Example 10 below. 

The method of weighted regression was developed by Koopmans³⁸
on the basis of earlier work of many authors, among whom Rhodes³⁹ and
van Uven⁴⁰ are most important.


36 A. Wald: “The Fitting of Straight Lines if Both Variables are Subject to Error,” Annals of
Mathematical Statistics, Vol. 11, 1940, pp. 284 ff.

37 E. J. Working: “What Do Statistical Demand Curves Show,” Quarterly Journal of Economics,
Vol. 41, 1927, pp. 212 ff.

38 T. Koopmans: Linear Regression Analysis in Economic Time Series, Haarlem, 1937.

39 E. C. Rhodes: “On Lines and Planes of Closest Fit,” Philosophical Magazine, ser. 7, Vol. 3, 1927,
pp. 357 ff.

40 M. J. van Uven: “Adjustment of N Points (in n-dimensional Space) to the best linear
(n−1)-dimensional Space,” Koninklijke Akademie van Wetenschappen, Amsterdam, Proceedings of the
Section of Science, Vol. 23, 1930, pp. 143 ff., 307 ff.

In the most general form we can pose the problem in the following
way: Assume that we have a meaningful (identified) linear
relationship between the $p$ economic variables $M_i$:

$$k_0 + k_1 M_1 + k_2 M_2 + \cdots + k_p M_p = w \tag{33}$$


where $k_0, k_1, \dots, k_p$ are “structural” coefficients and $w$ is a random
term. It results from variables not included in our system. But actually
we do not observe the “true” variables $M_1, \dots, M_p$ but the empirical
variables $X_{it}$ $(t = 1, 2, \dots, N)$. We have $N$ observations on each
variable. Let us assume that the systematic part $M_{it}$ is the mathematical
expectation of $X_{it}$ and the $y_{it}$ are the random disturbances.

$$
\begin{aligned}
X_{1t} &= M_{1t} + y_{1t} \\
&\;\;\vdots \\
X_{pt} &= M_{pt} + y_{pt}
\end{aligned}
\tag{34}
$$

We assume that the “disturbances” or errors $y_{it}$ are independent of
each other and normally distributed. They arise as errors of
measurement, from lack of representativeness of the empirical variables
$X_{it}$, from frictional causes, etc. It has been proposed to call the
$y_{it}$ disturbances in the variables and the $w$ disturbances in the equation.

There are two possibilities in dealing with this situation represented
by (33). We can either neglect the disturbances $y_{it}$ or the random term
$w$. This random term results from variables not included in the analysis
and similar causes. The first approach is implicit in Haavelmo’s,⁴²
Wald’s⁴³ and Marschak’s⁴⁴ work. The second assumption underlies the
fundamental scheme of Frisch⁴⁵ and the weighted regression analysis
developed by Koopmans.⁴⁶ We are going to deal only with the second
case. Our equation (33) becomes:

$$k_0 + k_1 M_{1t} + k_2 M_{2t} + \cdots + k_p M_{pt} = 0 \qquad (t = 1, 2, \dots, N). \tag{35}$$

We can only neglect $w$ if all or at least most of the important variables
have been included in our system.


41 L. R. Klein: “Pitfalls in the Statistical Determination of the Investment Schedule,”
Econometrica, Vol. 11, 1943, pp. 246 ff.

42 Op. cit. See also: T. Koopmans: “Statistical Estimation of Simultaneous Economic Relations,”
Journal of the American Statistical Association, Vol. 40, 1945, pp. 448 ff.

43 H. B. Mann and A. Wald: “On the Statistical Treatment of Linear Stochastic Difference
Equations,” Econometrica, Vol. 11, 1943, pp. 173 ff.

44 J. Marschak and W. H. Andrews: “Random Simultaneous Equations and the Theory of
Production,” Econometrica, Vol. 12, 1944, pp. 143 ff.

45 Op. cit. See also: R. Stone: The Analysis of Market Demand, London, National Institute of
Economic and Social Research, 1945.

46 Op. cit. See also: G. Tintner: “An Application of the Variate Difference Method to Multiple
Regression,” Econometrica, Vol. 12, 1944, pp. 97 ff.

It should be emphasized that it would be most desirable to combine
the two methods, i.e. to take the disturbances $y_{it}$ into account as well
as the random term $w$. Unfortunately it seems that great difficulties of
principle stand in the way.

Assume that we have estimates of the variances of the disturbances
$y_{it}$ in the population corresponding to our sample. These estimates $V_i$
may be based upon the Variate Difference Method,⁴⁷ upon deviations
from time regression functions, e.g. polynomials and Fourier series, or
may be given a priori. We could also use the method of principal
components as indicated above. The contribution of the first component
may be considered as the systematic part and everything else as the
random error. We assume that the estimates $V_i$ are accurate enough to
enable us to treat them as constants. Then, assuming normality and
independence of the disturbances $y_{it}$, we apply the method of maximum
likelihood, which leads under our assumptions to the method of least
squares.

Denote the deviations of the $X_{it}$ and of the $M_{it}$ from their respective
means by $x_{it}$ and $m_{it}$. The weighted sum of squares to be minimized
under the condition (35) is then:

$$Q = \sum_{t=1}^{N} \left[ \frac{(x_{1t} - m_{1t})^2}{V_1} + \cdots + \frac{(x_{pt} - m_{pt})^2}{V_p} \right]. \tag{36}$$

It should be noted that the weights are the reciprocals of the error
variances $V_i$.

It appears that the constant $k_0$ is determined by the condition that
the optimum solution has to pass through the means of all variables:

$$k_0 + k_1 \bar{X}_1 + k_2 \bar{X}_2 + \cdots + k_p \bar{X}_p = 0. \tag{37}$$

The best solutions for the remaining regression coefficients
$k_1, \dots, k_p$ are given by:

$$
\begin{aligned}
a_{11}k_1 + a_{12}k_2 + \cdots + a_{1p}k_p &= \lambda V_1 k_1 \\
&\;\;\vdots \\
a_{1p}k_1 + a_{2p}k_2 + \cdots + a_{pp}k_p &= \lambda V_p k_p.
\end{aligned}
\tag{38}
$$

This is a homogeneous linear system in the unknowns k_1, ..., k_p. It is
again very similar to the normal equations occurring in classical multiple
regression analysis. It can only have a non-trivial solution if its
determinant becomes zero:


47 G. Tintner: The Variate Difference Method, Bloomington, Ind., 1940.
(39)    \begin{vmatrix}
        a_{11} - \lambda V_1 & a_{12} & \cdots & a_{1p} \\
        \vdots & \vdots & & \vdots \\
        a_{1p} & a_{2p} & \cdots & a_{pp} - \lambda V_p
        \end{vmatrix} = 0.


The smallest latent root of the determinantal equation (39) can be
shown to be the minimum of Q divided by (N − 1).

By finding the smallest root λ_1 of the determinantal equation and
inserting it into the previous system (38) we determine the weighted
regression coefficients k_1, ..., k_p.

In equation (35) we can evidently fix one of the k_1, ..., k_p by an
arbitrary condition. Hence putting e.g. k_1 = −1 we change it into:

(40)    k_2' m_{2t} + k_3' m_{3t} + \cdots + k_p' m_{pt} = m_{1t} \qquad (t = 1, 2, \ldots, N).

The regression coefficients k_2', ..., k_p' are now given by the system of
equations:

(41)    \begin{aligned}
        (a_{22} - \lambda_1 V_2) k_2' + a_{23} k_3' + \cdots + a_{2p} k_p' &= a_{12} \\
        &\;\;\vdots \\
        a_{2p} k_2' + a_{3p} k_3' + \cdots + (a_{pp} - \lambda_1 V_p) k_p' &= a_{1p}.
        \end{aligned}

The k_2', ..., k_p' are our estimates of the weighted regression coefficients
in the population corresponding to our sample.
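For only two variables the procedure of equations (38) to (41) can be carried out in closed form, which may help to fix ideas. The following is a minimal sketch and not part of the original computations: the function name and the numerical inputs (variances, covariance and error variances) are hypothetical, chosen only so that the roots come out round.

```python
import math

def weighted_regression_two_vars(a11, a12, a22, v1, v2):
    """Roots of det(A - lam*diag(V)) = 0 for p = 2, and the slope k2'.

    a11, a22 are the variances, a12 the covariance of the two variables;
    v1, v2 are the (assumed known) error variances.
    """
    # Expanding the 2x2 determinant gives a quadratic in lam:
    #   v1*v2*lam^2 - (a11*v2 + a22*v1)*lam + (a11*a22 - a12^2) = 0
    qa = v1 * v2
    qb = -(a11 * v2 + a22 * v1)
    qc = a11 * a22 - a12 ** 2
    disc = math.sqrt(qb * qb - 4.0 * qa * qc)
    lam1 = (-qb - disc) / (2.0 * qa)  # smallest root
    lam2 = (-qb + disc) / (2.0 * qa)
    # With k1 = -1 the system reduces to (a22 - lam1*v2) * k2' = a12
    k2 = a12 / (a22 - lam1 * v2)
    return lam1, lam2, k2

# Hypothetical inputs, for illustration only
lam1, lam2, k2 = weighted_regression_two_vars(4.0, 1.8, 1.0, 1.0, 0.25)
print(lam1, lam2, k2)
```

With these inputs the smallest root is about 0.4 and the slope about 2.0; substituting k_1 = −1 and k_2 = 2 into the first line of the homogeneous system confirms that both sides equal −0.4.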
We can use the determinantal equation (39) for a test for collinearity:
Let λ_1, λ_2, ... be the smallest, next smallest, etc. roots of equation (39).
We form the test functions:

(42)    \Lambda_1 = (N - 1)\lambda_1,

(43)    \Lambda_2 = (N - 1)(\lambda_1 + \lambda_2).


Then it has been shown by Hsu⁴⁸ that the test functions Λ_1 and Λ_2 are
for large samples distributed like χ² with N − p and 2(N − p + 1) degrees
of freedom respectively. If Λ_2 is smaller than the χ² computed for a
given level of significance (e.g. 5 per cent or 1 per cent), we may
conclude that there are probably at least 2 independent relationships
between our p variables in the population corresponding to the sample.
Hence we probably have collinearity in the sense of Frisch and it is
appropriate to fit not one but at least two linear relationships of the
type (35). By computing other test functions we can actually estimate
the number of independent linear relationships which probably exist
in the population corresponding to our sample.

48 P. L. Hsu: “On the Problem of Rank and the Limiting Distribution of Fisher's Test Function,” Annals of Eugenics, Vol. 11, 1941, pp. 39 ff. See also: G. Tintner: “A Note on Rank, Multicollinearity and Multiple Regressions,” Annals of Mathematical Statistics, Vol. 16, 1945, pp. 304 ff.
We can also apply an approximate test of significance for the individual
k_i', given by Koopmans.

Denote by c_ii the i-th diagonal element of the inverse of the matrix used
to compute the k_i' in (41). This inverse may be computed by the methods
given by R. A. Fisher.⁴⁹ Then the standard error of the coefficient k_i',
which we denote by s_i, is given approximately by:

(44)    s_i^2 = c_{ii} \lambda_1 \left( V_1 + k_2'^2 V_2 + k_3'^2 V_3 + \cdots + k_p'^2 V_p \right) / (N - p).

The ratio k_i'/s_i is approximately distributed like Student's t with N − p
degrees of freedom. The t computed in this fashion may also be used
to establish fiducial or confidence limits for the weighted regression
coefficients. The covariance of k_i' and k_j' may be computed by similar
methods. The distribution of the variances and covariances has recently
been established for some special cases.⁵⁰


Example 8 


The method was applied by Koopmans⁵¹ to the ship freight market
for the period 1880-1911.

Let X_1 be the freight index (1900 = 100), X_2 transport (billions of
ton-miles), X_3 tonnage (millions of tons), X_4 coal price (shillings per
ton). All variables are expressed in percentages of the trend and the
weighted regression equation is: m_1 = 0.66 m_2 + 0.29 m_3 + 0.46 m_4.


Example 9 


In order to compute these weighted regression coefficients Koopmans
had to assume somewhat arbitrarily a set of weights, i.e., the error
variances V_i. The author⁵² has endeavored to estimate these weights by
the Variate Difference Method in a study of agricultural production in
the U.S., 1920-1941.

Let X_1 be the logarithm of the volume of agricultural production, X_2
the logarithm of employment in agriculture, X_3 the logarithm of operating
capital and X_4 time. The weighted regression equation appears then
as: m_1 = 2.7735 m_2 + 0.9020 m_3 + 0.0087 m_4. The first two coefficients
of this “Douglas type” production function are elasticities with respect
to labor and operating capital, while the third coefficient represents
an exponential trend. It appears, for instance, that an increase of
agricultural employment by 1 per cent will result in an increase of
agricultural production by about 2.8 per cent, etc.
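The reading of the coefficients as elasticities can be checked by direct arithmetic. The short sketch below is illustrative only (the function name is hypothetical): since the equation is linear in logarithms, a 1 per cent rise in an input multiplies output by 1.01 raised to the power of the coefficient, whatever the base of the logarithms.

```python
def pct_change_in_output(elasticity, pct_change_in_input):
    """Percentage change in output implied by a log-linear production function."""
    factor = (1.0 + pct_change_in_input / 100.0) ** elasticity
    return (factor - 1.0) * 100.0

# Coefficients of Example 9
labor_effect = pct_change_in_output(2.7735, 1.0)     # close to 2.8 per cent
capital_effect = pct_change_in_output(0.9020, 1.0)   # close to 0.9 per cent
print(round(labor_effect, 2), round(capital_effect, 2))
```

For small changes the result is nearly equal to the coefficient itself, which is why the text reads 2.7735 directly as "about 2.8 per cent".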

49 R. A. Fisher: Statistical Methods for Research Workers, 8th ed., New York, 1941, pp. 150 ff.

50 T. W. Anderson and M. A. Girshick: “Some Extensions of the Wishart Distribution,” Annals of Mathematical Statistics, Vol. 15, 1944, pp. 345 ff.

51 Op. cit., pp. 115 ff.

52 G. Tintner: “An Application of the Variate Difference Method to Multiple Regression,” loc. cit.
Example 10


The author⁵³ has also applied the method of weighted regression in
an attempt to find a demand and a supply function for agricultural
products in the United States.

Denote by X_1 prices received by farmers for agricultural products,
by X_2 national income, by X_3 agricultural production, by X_4 time
(origin between 1931 and 1932) and by X_5 prices paid by farmers. The
data were given annually for the 24 years 1920-43.

An analysis of the data by weighted regression necessitates the esti- 
mation of the error variances. These have again been established by the 
Variate Difference Method. 

An investigation of the problem with the help of the method explained
above shows that there are probably two relationships between
the 5 variables. Other tests show that there is probably one relationship
between the variables X_1, X_2, X_3 and X_4 and one between the
variables X_1, X_3, X_4 and X_5. The first is evidently the demand function
and the second the supply function. It should be noted that the inclusion
of X_2 (national income) in the first set and of X_5 (prices paid by
farmers) in the second set serves to make the relationships economically
meaningful. In this way we identify the first weighted regression equation
as the demand function and the second as the supply function.

Denoting deviations from the means by m_i we have for the demand
function:

(45)    m_3 = -0.097 m_1 + 0.429 m_2 + 0.313 m_4,

and the supply function:

(46)    m_3 = 1.721 m_1 + 0.804 m_4 - 3.611 m_5.


It appears from statistical tests that the results for the equation (45) 
are more reliable than for (46), as we should expect. Agricultural supply 
depends largely on the weather and other factors not included in the 
analysis. 

We can compute elasticities from these equations which are based 
upon the means of the variables over the period. The price elasticity 
of demand established from the first equation is —0.123. This is to 
say, other things equal an increase of 1 per cent in the prices of agricul- 
tural products results in a decrease of about 0.1 per cent in the quantity 
demanded. The income elasticity of demand is 0.307. Fiducial or con- 
fidence limits can also be established for these elasticities. They appear 
for the price elasticity of demand as —0.052 and —0.195 at the 5 per 


53 G. Tintner: “Multiple Regression for Systems of Equations,” loc. cit.
cent level. A test of significance shows that it is highly probable that the
income elasticity of demand is definitely greater than the price elasticity.
The importance of these tentative results for economic policy is
obvious.

Example 11 


We propose to illustrate the method of weighted regression further
by an example. We will endeavor to fit a production function for the
whole American economy in the period 1921-1941, using yearly data
for these 21 years. This is mainly an effort to continue the work of Mr.
Paul Douglas and his collaborators.⁵⁴

X_1 denotes the logarithm of labor in the U.S., both industrial and
agricultural, in million workers. X_2 is the logarithm of the total stock
of fixed capital in the economy, measured in billions of 1934 dollars. X_3
is the logarithm of total private net output, also in billions of 1934
dollars. X_4 is time measured from 1931 as origin. X_1 is taken from
statistical data published by the Department of Agriculture and the
Bureau of Labor Statistics. The other data have been taken with kind
permission from an unpublished essay by Mr. L. R. Klein.⁵⁵

The means of the data are given in Table 5 below:


                      TABLE 5

Variable              Symbol        Mean
log labor             X_1           1.651728
log fixed capital     X_2           2.051152
log production        X_3           1.768762


We have indicated above that various methods are available for
estimating the variances V_i of the random elements. We can utilize
the Variate Difference Method for this purpose if the following conditions
are fulfilled: Each variable consists of the mathematical expectation
or systematic part M_it, which is a smooth function of time, plus
the random or error part y_it. Then we can eliminate or at least greatly
reduce the systematic component by taking differences. If difference
series of a high enough order are computed we will eventually have
eliminated the systematic part entirely or at least sufficiently. Hence
this difference series and all higher difference series will consist of the
random part alone, or at least substantially of the random component
y_it.

In order to form an idea of the difference series in which this is the case
we compute by appropriate formulae the variances of the successive
difference series.
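The computation of such variances can be sketched as follows. This is a minimal illustration under an assumption not spelled out in the text: the variance of the k-th difference series is taken as the sum of squared k-th differences divided by (N − k) and by the binomial coefficient C(2k, k), so that for a purely random series every order estimates the same error variance. The function name and the sample series are hypothetical.

```python
from math import comb

def difference_variances(series, max_order):
    """Variances of successive difference series, scaled by C(2k, k)."""
    variances = {}
    diffs = list(series)
    for k in range(1, max_order + 1):
        # k-th differences are first differences of the (k-1)-th differences
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
        if not diffs:
            break
        sum_sq = sum(d * d for d in diffs)
        variances[k] = sum_sq / (len(diffs) * comb(2 * k, k))
    return variances

# Hypothetical series: a pure quadratic trend with no random part,
# so the third difference removes the systematic component entirely.
result = difference_variances([0, 1, 4, 9, 16], 3)
print(result)
```

For a series with a random part the scaled variances do not fall to zero but settle near a constant level once the trend has been differenced away; the order at which they settle is the one from which the error variance is read.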

54 P. H. Douglas: Theory of Wages, New York, 1934. See also H. T. Davis: Theory of Econometrics, Bloomington, Ind., 1941, pp. 153 ff. and the literature quoted on p. 159.

55 L. R. Klein: Economic Fluctuations in the U. S., 1921-41.
We need estimates of the error variances of X_1, X_2, X_3. The following
table gives the variances of various difference series of the variables:


                             TABLE 6
                 VARIANCES OF DIFFERENCE SERIES

Order of
Difference     X_1 log labor     X_2 log capital     X_3 log product
    1           0.00026497         0.00001897          0.00082478
    2           0.00012103         0.00000250          0.00026453
    3           0.00010135         0.00000112          0.00020488
    4           0.00009259         0.00000080          0.00018496
    5           0.00007757         0.00000065          0.00016422


Tests⁵⁶ indicate that the second difference of X_1, the third difference of
X_2 and the second difference of X_3 give under our assumptions reasonably
accurate estimates of the “true” error variances of the variables in
question. The error variances are for the reader’s convenience
represented in the following table:


                      TABLE 7
                  ERROR VARIANCES

Variable          Symbol        Error Variance
log labor         X_1           0.00012103
log capital       X_2           0.00000112
log product       X_3           0.00026453


These error variances are assumed to be estimated with enough ac- 
curacy so that we can treat them as constants. This assumption is prob- 
ably not fully justified in our case. 

The resulting weighted regression equation will only represent the
production function if, apart from an exponential trend, this production
function was stable or at least reasonably stable over the period
considered, while there were fluctuations in the other relationships
involved in production, especially the supply of productive services and
the demand for the product. This assumption seems to be approximately
justified because of the fluctuations of those quantities during the
business cycle.

We want to fit the weighted regression function:

(47)    k_1 m_1 + k_2 m_2 + k_3 m_3 + k_4 m_4 = 0

taking into account the fundamental assumptions of our method. The
linear equations for the coefficients are derived from the
variance-covariance matrix of the variables:


                                TABLE 8
                      VARIANCE-COVARIANCE MATRIX

           X_1            X_2            X_3            X_4
X_1     0.00093135     0.00000045     0.00187085     0.00515500
X_2                    0.00030905     0.00045005     0.06067500
X_3                                   0.00562850     0.25020000
X_4                                                 38.50000000


56 G. Tintner: The Variate Difference Method, op. cit., pp. 67 ff.
The determinantal equation (39) becomes:

(48)    \begin{vmatrix}
        0.00093135 - 0.00012103\lambda & 0.00000045 & 0.00187085 & 0.00515500 \\
        0.00000045 & 0.00030905 - 0.00000112\lambda & 0.00045005 & 0.06067500 \\
        0.00187085 & 0.00045005 & 0.00562850 - 0.00026453\lambda & 0.25020000 \\
        0.00515500 & 0.06067500 & 0.25020000 & 38.50000000
        \end{vmatrix} = 0.


The two smallest roots of this equation are: λ_1 = 0.5482 and λ_2 = 22.304.
The first can be used to form the test function: Λ_1 = 20(0.5482)
= 10.9640. This is for large samples distributed like χ² with 17 degrees of
freedom. The χ² permitted at the 5 per cent level is 27.587 and at the 1
per cent level 33.409. Hence Λ_1 is not significant.

Next we compute Λ_2 = 20(0.5482 + 22.304) = 457.044. This is again
distributed like χ² with 36 degrees of freedom. For the 5 per cent level of
significance we get a permissible χ² of 50.714 and for the 1 per cent level
57.804. Our empirical Λ_2 is significant. Hence we conclude that it is
unlikely that there is more than one linear relationship between the 4
variables X_1, X_2, X_3 and X_4 in the population corresponding to our
sample. We do not seem to have multicollinearity.
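The arithmetic of the two test functions can be retraced in a few lines. The sketch below is illustrative only: the function name is hypothetical, the roots and the critical values of χ² are simply those quoted in the text, and with the roots as printed the second test function evaluates to 20 × 22.8522 = 457.044.

```python
def rank_test_functions(lam1, lam2, n_obs, n_vars):
    """Test functions (42) and (43) with their degrees of freedom."""
    big_lambda_1 = (n_obs - 1) * lam1
    big_lambda_2 = (n_obs - 1) * (lam1 + lam2)
    df_1 = n_obs - n_vars
    df_2 = 2 * (n_obs - n_vars + 1)
    return (big_lambda_1, df_1), (big_lambda_2, df_2)

# N = 21 yearly observations, p = 4 variables, roots as quoted in the text
(stat_1, df_1), (stat_2, df_2) = rank_test_functions(0.5482, 22.304, 21, 4)

# Critical values as quoted in the text (5 per cent level)
CHI2_5PCT_DF17 = 27.587
CHI2_5PCT_DF36 = 50.714

first_significant = stat_1 > CHI2_5PCT_DF17    # False: one relationship is plausible
second_significant = stat_2 > CHI2_5PCT_DF36   # True: a second one is unlikely
print(stat_1, df_1, first_significant)
print(stat_2, df_2, second_significant)
```

The second statistic exceeds its critical value by so wide a margin that the conclusion does not depend on the last digits of the roots.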

Inserting the smallest root λ_1 = 0.5482 into the determinantal equation
(48) we get a matrix for the computation of the regression coefficients
of our weighted regression: m_3 = k_1' m_1 + k_2' m_2 + k_4' m_4.

The equations to be solved are in our case:

(49)    \begin{aligned}
        0.00086500 k_1' + 0.00000045 k_2' + 0.00515500 k_4' &= 0.00187085 \\
        0.00000045 k_1' + 0.00030844 k_2' + 0.06067500 k_4' &= 0.00045005 \\
        0.00515500 k_1' + 0.06067500 k_2' + 38.50000000 k_4' &= 0.25020000
        \end{aligned}

The solution is given in the following weighted regression equation:

(50)    m_3 = 2.128806 m_1 + 0.338665 m_2 + 0.005680 m_4.
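The three equations above can be solved by any standard method. The following sketch uses plain Gaussian elimination with partial pivoting; the helper `solve_linear_system` is a hypothetical name introduced here for illustration, and the coefficients are those of system (49) with the right-hand sides 0.00187085, 0.00045005 and 0.25020000.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - known) / aug[r][r]
    return solution

coefficients = [
    [0.00086500, 0.00000045, 0.00515500],
    [0.00000045, 0.00030844, 0.06067500],
    [0.00515500, 0.06067500, 38.50000000],
]
right_hand_side = [0.00187085, 0.00045005, 0.25020000]

k1, k2, k4 = solve_linear_system(coefficients, right_hand_side)
print(k1, k2, k4)
```

The solution agrees with the coefficients of the weighted regression equation to about four decimal places; small differences in the last digits reflect the rounding of the printed matrix entries.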


This production function gives estimates of the elasticities of production
with respect to labor and fixed capital and also a time trend as they
presumably exist in a hypothetical population. For instance, other
things equal, an increase of 1 per cent in the total fixed capital will
increase the total product by more than 1/3 per cent. An increase in the
total labor force by 1 per cent will increase the product by more than 2
per cent. The last term represents an exponential trend. It has to be
interpreted in this way: Production increased about 3 per cent each year
during the period. This last estimate agrees with earlier estimates of
Carl Snyder⁵⁷ and others.


57 C. Snyder: Capitalism the Creator, New York, 1940.
The weighted regression equation (50) given above should be compared
with the regression equation which has been derived by classical
methods:

(51)    x_3 = 1.976985 x_1 + 0.332328 x_2 + 0.005710 x_4.

This, however, is designed to predict x_3 most successfully if x_1, x_2 and
x_4 are given.

Using the geometric means whose logarithms appear in Table 5, we
can also compute the marginal productivities of capital and labor.
From the weighted regression coefficients we get for the marginal
productivity per worker $2,279.63 and for the marginal productivity
of the stock of fixed capital per dollar $0.292. This is to say: If conditions
are on the average the same as in the period considered, we conclude
that other things equal the addition of one worker will result
in an increase in the national product by about $2,000. The addition
of one more dollar to the stock of fixed capital will bring about, ceteris
paribus, an increase in the net national product of almost $0.30. Both
these estimates appear somewhat high, but maybe not excessively so in
the light of some previous investigations in the field of agricultural
production functions.⁵⁸ These results are, however, not strictly comparable
to our production function, which has been derived for the whole
economy.

All these results should be interpreted in the light of their statistical
variability as described by their approximate standard errors. The matrix
inverse to the one used in computing our weighted regression equation
(50) is given in the following table:


                      TABLE 9
                   INVERSE MATRIX

            X_1           X_2           X_4
X_1      1,157.363       41.734       −0.221
X_2                   4,700.355       −7.413
X_4                                    0.038


Using these data and the previous results we compute the approximate
standard errors of the weighted regression coefficients. The standard
error of the coefficient of m_1 in the weighted regression equation
(2.128806) turns out to be 0.174; the one of the coefficient of m_2
(0.338665) appears as 0.351; and the one of the coefficient of m_4
(0.005680) is 0.0009994. Using the t-test, we see that the corresponding
values of t are: 12.235, 0.965 and 5.714. The t required for 17 degrees of


58 G. Tintner and O. H. Brownlee: “Production Functions Derived from Farm Records,” Journal of 
Farm Economics, Vol. 26, 1944, pp. 566 ff. 
freedom at the 5 per cent level of significance is 2.111 and the one for
the 1 per cent level is 2.898. It turns out that the coefficients of m_1 and
m_4 are highly significant, but not the one of m_2. Hence it would appear
that we can perhaps with some accuracy determine the elasticity of
production with respect to labor, but not the one with respect to the
stock of fixed business capital. A possible explanation of this is that
the effects of an increase in fixed capital may not appear in the same
year but in subsequent years. The time trend can also be determined
with reasonable accuracy. All these results are only of an approximate
nature.

Finally we want to give fiducial or confidence limits for our estimate 
of the elasticity of production with respect to labor. Using a confidence 
coefficient of 99 per cent, we get for the limits of the elasticity: 2.633 
and 1.625. This has to be interpreted in the following way: The chances 
are 99 in 100 that an increase in the total labor force by 1 per cent will 
increase the product by not more than about 2.6 per cent and not less 
than about 1.6 per cent. These are pretty wide limits and emphasize 
the tentative nature of our conclusions. 
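The standard errors and the 99 per cent limits for the labor elasticity can be retraced from the published figures. The sketch below rests on two assumptions that should be kept in mind: the standard error is taken in the form s_i² = c_ii λ_1 (V_3 + k_1'² V_1 + k_2'² V_2 + k_4'² V_4)/(N − p), a form which reproduces the values reported above, and time (X_4) is treated as free of error, so that its error variance is zero. All variable names are hypothetical.

```python
import math

LAMBDA_1 = 0.5482          # smallest root of the determinantal equation
DEGREES_OF_FREEDOM = 17    # N - p = 21 - 4
T_99 = 2.898               # t for 17 degrees of freedom, 1 per cent level (from the text)

coefficient = {"m1": 2.128806, "m2": 0.338665, "m4": 0.005680}
# Error variances from Table 7; time is assumed to be measured without error
error_variance = {"m1": 0.00012103, "m2": 0.00000112, "m3": 0.00026453, "m4": 0.0}
# Diagonal elements of the inverse matrix in Table 9
inverse_diagonal = {"m1": 1157.363, "m2": 4700.355, "m4": 0.038}

# Weighted sum of error variances; the dependent variable m3 has coefficient 1
weighted_sum = error_variance["m3"] + sum(
    coefficient[name] ** 2 * error_variance[name] for name in coefficient
)
scale = LAMBDA_1 * weighted_sum / DEGREES_OF_FREEDOM

standard_error = {
    name: math.sqrt(inverse_diagonal[name] * scale) for name in coefficient
}
t_ratio = {name: coefficient[name] / standard_error[name] for name in coefficient}

# 99 per cent limits for the elasticity of production with respect to labor
lower_limit = coefficient["m1"] - T_99 * standard_error["m1"]
upper_limit = coefficient["m1"] + T_99 * standard_error["m1"]

print(standard_error)
print(t_ratio)
print(lower_limit, upper_limit)
```

The computed standard errors come out close to 0.174, 0.351 and 0.001, and the limits close to 1.62 and 2.63; the t ratios differ slightly from the printed ones in the last digits, presumably because of rounding in the printed inputs.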

The same type of analysis can also be applied to the marginal 
productivity of labor. Using a confidence coefficient of 99 per cent we 
get for these limits: 2,819.54 and 1,740.13. Ceteris paribus, under con- 
ditions approximately the same as the ones prevailing in the period 
considered, we can make this statement: The chances are 99 in 100 that 
an increase of the labor force by one worker will result in an increase of 
the total national product by not more than about $2800 and not less 
than about $1700. The latter figure is probably nearer to the true value. 

We want to stress finally that the results for the production function 
of the whole United States should not be taken too seriously. Our data 
are perhaps not quite adequate for the determination of such a func- 
tion. The economic meaning of a production function representing all 
enterprises is also somewhat doubtful. It would be more desirable to try 
to fit production functions of the Douglas type to specific industries. We 
believe, however, that the methods indicated above should be tried 
in the statistical analysis of such a problem. 
REPRODUCTION RATES ADJUSTED FOR AGE,
PARITY, FECUNDITY, AND MARRIAGE 


P. K. WHELPTON 
Scripps Foundation for Research in Population Problems 


The reproduction rates computed in the past have been 
age-adjusted, i.e. based on age specific birth rates. Because 
order of birth and parity of mother were ignored, these rates 
have had an upward bias in some years and a downward bias 
in others. The omission of an allowance for marriage and fe- 
cundity has had a similar effect. The reasons for these biases 
are analyzed; a method for utilizing age-parity specific rates 
and allowing for spinsterhood and sterility is described; and 
the different types of rates are shown for selected years. 


THE gross or net reproduction rates and intrinsic rates of natural
increase computed in the past have been adjusted for age. They
show what would occur if the age specific birth and death rates of
females in the various age cohorts of an actual population during the
base period were to apply to a hypothetical cohort of females during
its lifetime.¹ These age-adjusted rates have been extremely useful to
demographers, and have been considered highly accurate measures of 
the fertility of the base period. As far as the writer can ascertain, how- 
ever, no one has analyzed adequately the validity of certain phases of 
the methodology. Is it theoretically possible for the age specific birth 
rates of any actual population during any base period to remain in effect 
throughout the life time of a hypothetical cohort? When applying the 
birth rate of the women of a given age in an actual population to the 
women of that age in the hypothetical cohort, is it correct to ignore the 
previous birth rates of the actual women? What is the effect of disre- 
garding the incidence of sterility and spinsterhood? In short, should re- 
production rates be based on birth rates which are specific for parity, 
marriage, and fecundity, as well as age? And if so, how much would 
they be changed? These are the questions which will be discussed here. 


1. CAN A HYPOTHETICAL COHORT HAVE THE AGE SPECIFIC 
BIRTH RATES OF AN ACTUAL POPULATION? 


Heretofore the computation of a gross or net reproduction rate or an 
intrinsic rate of natural increase has required the use of age specific 
birth rates. Conventionally, the average annual numbers of live births 
to white (or colored) women during the base period are classified by 


1 In a few cases reproduction rates have been computed for males and the total population. See, for example, Myers, Robert J.: The Validity and Significance of Male Net Reproduction Rates. Journal of the American Statistical Association, Vol. 36, No. 214, June 1941, pp. 275-282.
5-year age groups (15-19, 20-24, etc.) of mother as of time of birth,
corrected for underregistration, and divided by the numbers of white 
(or colored) women in the corresponding age groups at the middle of the 
base period. It is assumed that a hypothetical cohort of white (or 
colored) women living through the childbearing period will have (1) 
at ages 15 through 19 five times the average annual birth rate which 
white (or colored) women aged 15-19 had in the base period, (2) at 
ages 20 through 24 five times the average annual birth rate which 
white (or colored) women aged 20-24 had in the base period, etc. The 
total number of births to the hypothetical cohort is computed, and 
multiplied by the percentage of infants that are girls to obtain the 
number of female births.²

The births by age of mother used in computing the age specific birth 
rates just described include births of several orders. In 1942, for ex- 
ample, births of the first to sixth orders were registered as occurring 
to women aged 15-19, and births of the first to 22nd orders to women 
aged 40-44. Up to the present, however, there apparently has been no 
reference to order of birth in the methodology of computing a gross or 
net reproduction rate or an intrinsic rate of natural increase. One ex- 
planation of the omission could be that within each age group the rate 
for first births, the rate for second births, and the rate for births of each 
other order are assumed implicitly to apply to the hypothetical cohort, 
just as the sum of these rates (the age specific rate based on births 
regardless of order) is assumed to apply to it. Another explanation 
could be that within an age group compensating changes are assumed 
to occur in rates by birth order. For example, it could be assumed that 
at ages 20-24 the first birth rates of the hypothetical cohort would be 
10 per cent lower than those of the actual cohorts, but the rates for other 
birth orders would be sufficiently higher so that the total number of 
births at these ages would not be affected. But because the methodology 
emphasizes the assumption that the age specific rates (for all birth 
orders combined) of an actual population apply unchanged to a theo- 
retical cohort—not that there be compensating changes—it is logical to 
believe that the first assumption mentioned has been made implicitly, 
namely that the age specific rates by order of birth of an actual popula- 
tion apply to a hypothetical cohort living through the childbearing 
period. 

Is this implicit assumption theoretically possible for any given 

2 Using specific birth rates by single years of age instead of rates by 5-year age groups multiplied 
by five does not change significantly the gross or net reproduction rate, nor the intrinsic rate of natural 
increase, unless the distribution of women and/or of births by age of mother within the 5-year age 


groups is very abnormal. Similarly, using age specific percentages of infants that are girls has little ef- 
fect on the final results. 
year? The answer may be obtained from an inspection of the rates for
first births, second births, and births of other orders per 1,000 native 
white women by 5-year age groups for each year from 1940 to 1944 in 
the United States. Adding the age specific rates for all births regardless 
of order and multiplying by five are essential steps in computing the 
gross reproduction rate, and give the number of births of all orders per 
1,000 women living through the childbearing period in the hypothetical 
cohort. The same procedure applied to first birth rates gives the first 
births per 1,000 women living to age 50 in the hypothetical cohort. If 
the cohort had the 1940 age specific first birth rates, 1,000 women 
would have 820 first births. (See Table 1.) With the 1944 rates there 
would be 868 first births per 1,000 women, and with the 1941 rates 
916 first births. Such fertility is possible in theory or practice. But for 
1,000 women to have 1,084 first births, the result obtained for 1942, is 
impossible both practically and theoretically, even for a hypothetical 
cohort.⁴ And for 1,000 women to have 997 first births (the result ob-
tained for 1943), is equally impossible in view of what is known about 
the incidence of sterility. Second and higher order births present no 
such problem in the years for which data are available, because the age 
specific rates for these birth orders are substantially smaller in the 
aggregate than those for first births. 
                               TABLE 1

  FIRST BIRTHS PER 1,000 NATIVE WHITE WOMEN BY 5-YEAR AGE PERIODS,
                     UNITED STATES, 1940 TO 1944

                        Central First Birth Rates*
Age Period       1940        1941        1942        1943        1944
15-19†          35.09       37.88       42.41       42.47       36.70
20-24           65.88       75.01       90.49       81.00       72.60
25-29           40.87       45.99       55.38       48.24       39.31
30-34           16.41       18.03       21.17       19.88       17.10
35-39            4.82        5.34        6.37        6.74        6.68
40-44             .81         .88         .95        1.09        1.23
45-49‡            .06         .05         .05         .06         .06
Total          163.94      183.18      216.82      199.48      173.68
Total × 5      819.70      915.90    1,084.10      997.40      868.40

* These rates are adjusted for incomplete registration in accordance with data of the Division of
Vital Statistics based on the test period December 1, 1939 to April 1, 1940.

† Includes the few births to women younger than 15.

‡ Includes the few births to women 45 or older.
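The totals in Table 1 follow from the conventional procedure of adding the age specific rates and multiplying by five. A short sketch, using only the rates printed in the table (the variable names are hypothetical), retraces the computation and picks out the years in which the implied number of first births exceeds the number of women:

```python
# Central first birth rates per 1,000 native white women, from Table 1,
# in the order of the age periods 15-19, 20-24, ..., 45-49
first_birth_rates = {
    1940: [35.09, 65.88, 40.87, 16.41, 4.82, 0.81, 0.06],
    1941: [37.88, 75.01, 45.99, 18.03, 5.34, 0.88, 0.05],
    1942: [42.41, 90.49, 55.38, 21.17, 6.37, 0.95, 0.05],
    1943: [42.47, 81.00, 48.24, 19.88, 6.74, 1.09, 0.06],
    1944: [36.70, 72.60, 39.31, 17.10, 6.68, 1.23, 0.06],
}

YEARS_PER_AGE_GROUP = 5

# First births per 1,000 women living through the childbearing period
first_births_per_1000 = {
    year: YEARS_PER_AGE_GROUP * sum(rates)
    for year, rates in first_birth_rates.items()
}

# A cohort of 1,000 women cannot have more than 1,000 first births
impossible_years = [
    year for year, total in first_births_per_1000.items() if total > 1000
]

for year, total in first_births_per_1000.items():
    print(year, round(total, 1))
print(impossible_years)
```

Only 1942 fails this purely arithmetical check; the 1943 figure of 997 passes it and is ruled out in the text on the separate ground of what is known about sterility.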


3 In computing a gross reproduction rate it is assumed that if the women who die before the end of
the childbearing period had lived they would have the same age specific birth rates as those who live
to the end of the period.

4 In this connection it should be remembered that multiple live births are recorded and tabulated
as births of two or more orders. Thus if twins are borne at a first pregnancy, order of birth is noted as
first on the birth certificate for one baby and as second on that for the other.
The conclusion may be stated in general terms as follows: If the
various cohorts composing an actual population have a sufficiently
large number of first births in a certain year (or years), it is invalid to
assume in computing reproduction rates⁵ that a hypothetical cohort
will have during its reproductive lifetime the age specific rates for
first births of the various cohorts during that year (or years). Further- 
more, the fact that the foregoing assumption yields impossible results 
for certain years casts doubts on the accuracy of the results for other 
years. Fortunately the situation can be improved in one of three ways. 
(a) The second assumption listed above can be made, namely, that the 
age specific rates for first births will be smaller for the hypothetical 
cohort than for the actual population, but these declines will be exactly 
balanced as far as numbers of births are concerned by larger age specific 
rates for births of higher orders. Obviously this is objectionable for 
several reasons. (b) The base period can be lengthened in the hope that 
the peculiarities of individual years will cancel out. This is undesirable 
because it is important to measure year to year changes as well as the 
average situation during several years. (c) Birth rates can be computed 
for the actual population which can be applied correctly to a hypo- 
thetical cohort. This suggestion seems most promising, hence it will 
be analyzed in detail. 

The first revision in procedure that is needed in order to improve the 
accuracy of reproduction rates for any year is the inclusion of an adjust- 
ment for order of birth of child and parity of women. From a 
theoretical standpoint the use of birth rates which are specific for birth 
order and parity as well as age is a decided improvement, and should 
have been advocated long ago by demographers. The delay probably is 
due in part to a tendency to accept as adequate in fertility analysis the 
procedures developed and used earlier in mortality analysis. Unfortunately
three fundamental differences have been overlooked in this
carry-over.⁷ First, although a cat proverbially has nine lives, each
woman (no matter how “catty” she may be) has only one; she must
die, but she can die only once. In contrast, the fecundity and fertility 
of women vary widely. Some cannot have a child; many have two or 
three; a very few have 20 or more. Because of this difference, age specific 
birth rates are much less adequate in preparing a reproduction table 
than are age specific death rates in preparing a life table. To raise the 




5 For the sake of brevity the phrase “reproduction rates” will be used hereafter in referring to the 
three rates—gross reproduction rates, net reproduction rates, and intrinsic rates of natural increase— 
as a group. 

6 Parity as used in this paper means the number of children born alive. A zero parity woman has
not borne a child alive, a first parity woman has had one live birth, etc.

7 The other two differences relate to sterility and spinsterhood, and are discussed in sections 3 
and 4. 
former to the level of the latter requires (among other things) recognition
of the relation between parity of woman and birth order of child.
First births can occur only to zero parity women, second births can 
occur only to first parity women, etc. In short, only the women of 
n parity can be exposed to the risk of bearing a child of n+1 order. 

From a practical standpoint the use of age specific birth rates by 
order of birth and parity of women⁸ for the United States was impossible
a few years ago because there were so few statistics regarding the
distribution of women by parity. Fortunately, the parity question was 
reinstated in the 1940 census, and tabulations by parity were made 
on a sample basis for this census and that of 1910. Tables are now avail- 
able which show the number of women classified by color, nativity, age, 
and number of children ever born for the United States and certain 
subdivisions. 

If the age-parity specific birth rates of an actual population are applied
to a hypothetical cohort, it is impossible for a gross reproduction
rate to show that 1,000 women have more than 1,000 first births. The
explanation is simple. First births at each age of mother in the base 
year are related to zero parity women of that age, and the resulting 
rates are applied to the zero parity women in the hypothetical cohort. 
At the beginning of the childbearing period all of the women in the 
cohort are of zero parity. As these zero parity women have a first 
birth at any subsequent age they are transferred to the first parity 
group. If the age specific zero parity first birth rates are sufficiently 
high, all the women in a hypothetical cohort are transferred from 
zero to first parity before reaching the oldest childbearing age, in 
which case the upper limit of 1,000 first births per 1,000 women in the 
cohort is reached, but not passed. In no year for which data are avail- 
able, however, has this limit been approached closely. Even with the 
age specific zero parity birth rates of 1942, the highest on record, 1,000 
women living to age 50 would have only 875 first births. Similar reason- 
ing shows that the number of second births in the hypothetical cohort 
cannot exceed the number of first parity women, or the number of 
third births exceed the number of second parity women, etc., when the 
base rates are age-parity specific. Actually, the theoretical upper limits 
are not approached closely. 
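
The bookkeeping described above can be sketched in a few lines of Python. The sketch is illustrative only: the first birth probabilities are hypothetical round numbers rather than the 1942 rates, mortality is ignored (all women are assumed to live to age 50), and the function names are invented for this illustration. It contrasts probabilities applied to zero parity women only, which cannot yield more first births than there are women, with age specific first birth rates applied to all women, which can.

```python
def first_births_age_parity(cohort_size, probs_for_zero_parity):
    """First births when each probability applies only to zero parity women."""
    zero_parity = float(cohort_size)   # at the start all women are childless
    first_births = 0.0
    for prob in probs_for_zero_parity:
        births = zero_parity * prob    # first births at this age
        zero_parity -= births          # these mothers move to first parity
        first_births += births
    return first_births


def first_births_age_specific(cohort_size, rates_for_all_women):
    """First births when each rate is applied to all women, whatever their parity."""
    return cohort_size * sum(rates_for_all_women)


# Hypothetical values for the 35 single years of age from 15 through 49.
HIGH_PROBS = [0.20] * 35    # assumed probability per zero parity woman
HIGH_RATES = [0.031] * 35   # assumed rate per woman of any parity

capped = first_births_age_parity(1000, HIGH_PROBS)
uncapped = first_births_age_specific(1000, HIGH_RATES)
print(f"age-parity specific: {capped:.1f} first births per 1,000 women")
print(f"age specific:        {uncapped:.1f} first births per 1,000 women")
```

However high the assumed probabilities are set, the first total stays at or below 1,000, because each first birth removes a woman from the group exposed to the risk. The second total has no such ceiling.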


2. ALLOWING FOR THE RELATION BETWEEN THE FERTILITY OF A 
COHORT AT A GIVEN AGE AND ITS FERTILITY AT YOUNGER AGES 


The use of age-parity specific birth rates in computing reproduction 
rates avoids impossible assumptions for years like 1942 because it provides


8 Such rates will be referred to hereafter as age-parity specific rates.
for the relation between (a) birth performance at a given age
and (b) birth performance at younger ages. As was pointed out above, 
transferring women from zero parity to first parity when they have a 
first birth and using birth rates by parity make it impossible to assign 
more than one first birth to any woman. This is desirable for obvious 
reasons. Less obvious, but important nevertheless, is the fact that be- 
cause age-parity specific birth rates for a given year give proper weight 
to the fertility in prior years of each age cohort in the actual popula- 
tion, they measure more accurately than age specific rates the birth 
performance of the population in the year in question. For example, 
during the depression years of the 1930’s there was a tendency for 
marriage and the starting of a family to be delayed, which raised the 
proportion of women who were childless to a relatively high figure, 
especially at the younger childbearing ages. In contrast, prosperity and 
war encouraged marriage and the starting of a family between
1940 and 1945, and reduced to a relatively low figure the proportion of 
the younger women who are childless. If a comparison of the fertility of 
the younger childless women in 1940 and 1945 is based on (a) age 
specific first birth rates and (b) age-parity specific first birth rates, the 
relative position of 1945 is more favorable on the latter basis than on 
the former. In this case, of course, the comparison more favorable to 
1945 is correct because it reflects the differences in the relative num- 
ber of zero parity women in the two base years. For the same reason 
a comparison between the reproduction rates for 1940 and 1945 is 
more accurate if computed from age-parity specific birth rates than 
from age specific rates. It is possible in theory for the bias mentioned 
to be equalized exactly by biases in the opposite direction for other
ages or other birth orders. That exact equalization will happen in 
practice, however, is unlikely. 

Basing reproduction rates on age-parity specific birth rates has the 
additional advantage of allowing for the fact that the computed birth 
performance prior to a given age of women in a hypothetical cohort will 
differ in most cases from the actual birth performance prior to the same 
age of the women in the actual population whose rates are being ap- 
plied to the cohort. This is important because, as explained above, 
fertility at a given age is affected by fertility at younger ages. For 
example, seventh or higher order births constituted more than half 
of the births in 1940 to native white women aged 40-44, and almost all
of them occurred to mothers who had borne at least six children before
1940.⁹ In computing a conventional 1940 reproduction rate it is assumed
that the women in the hypothetical cohort will have at ages


9 The few exceptions are the women who had multiple births or two confinements, or both, in 1940.
40-44 the 1940 age specific birth rates at these ages. But whereas the
proportion of the women aged 40-44 that had borne six or more 
children was 13 per cent for the actual 1940 population it would be 
only 6 per cent for the hypothetical cohort. In consequence the num- 
ber of seventh or higher order births to 1,000 women in the hypo- 
thetical cohort at ages 40-44 is substantially lower, and correspond- 
ingly more accurate, when computed from age-parity specific rates 
than when computed from age specific rates. It might be thought that 
the smaller number of seventh or higher order births yielded by parity 
specific rates at ages 40-44 would be offset by a larger number of first 
to sixth births, because of the larger proportion of zero to five parity 
women in the hypothetical cohort than in the 1940 population. Such 
is not the case. Instead, the total number of births per 1,000 women 
aged 40-44 is 14.9 for the actual population but only 10.7 for the hypo- 
thetical cohort. Obviously the reproduction rates based on age specific 
birth rates are biased accordingly. As before, such biases could happen 
to be exactly compensating, but the odds are heavily against it. 


3. ALLOWING FOR THE EFFECT OF STERILITY 


The basic age specific birth rates used in computing conventional 
reproduction rates are obtained by relating births to the fecund plus 
the sterile persons in the total population (or in the race, nativity, sex, 
or other group in question) rather than to the fecund persons only.¹⁰
Failure to exclude sterile persons may be due in part to the tendency 
suggested earlier to carry over to fertility analysis the procedures 
used previously in mortality analysis. Relating births, like deaths, to
all persons is objectionable in theory, however, because it ignores the
fact that whereas each person must die sooner or later, some persons
in the so-called reproductive age groups cannot become parents. Speaking
in terms of women, no woman of any age can avoid exposure to the
risk of dying at that age. In contrast, some women of an age within 
the childbearing period cannot, for physiological reasons, be exposed 
to the risk of having a child at that age. Deaths of women of a given 
age should be related to all women of that age, for all of them are at 
risk. Similarly, n order births at a given age should be related to the 
women who can have such a birth, namely, the fecund n-1 parity
women of that age, and not these women plus the n-1 parity women
of the same age who are sterile. In discussing this problem a distinction 

10 Fecund and sterile are used in accordance with the definitions adopted by the Population Association
of America, namely: a fecund person has the physiological ability to participate in reproduction
at the age or time in question; a sterile person lacks this ability. 


This analysis deals with one method of allowing for sterility. If age specific rates for the onset of 
sterility were available, the analysis and conclusions would be somewhat different. 
will be made between full time or complete sterility (the lack of the
ability to participate in reproduction at any time) and part time or
partial sterility (the lack of the ability at one time but not at another).
The reasons for this distinction will become evident during the discus- 
sion. 

From a practical standpoint there are two explanations for not allowing
heretofore for the effect of complete sterility in computing reproduction
rates. The first is the lack of information about the number of
sterile persons in the population, and will be considered later. The 
second is that a reproduction rate computed from age specific birth 
rates is not affected by an allowance for sterility. For example, the 
net reproduction rate for native white women in 1942 based on age 
specific birth rates with no allowance for sterility is 116. (See Table 2, 


TABLE 2

THE NET REPRODUCTION RATE FOR NATIVE WHITE WOMEN IN 1942
COMPUTED FROM AGE SPECIFIC BIRTH RATES, WITH AND WITHOUT
AN ADJUSTMENT FOR STERILITY

Column headings:
  A  Number of women in hypothetical cohort*
  B  Births per 1,000 women in actual population†
  C  Per cent of women assumed fecund‡
  D  Number of fecund women in hypothetical cohort (A × C)
  E  Births per 1,000 fecund women in actual population†
  F  Births to women in hypothetical cohort (A × B) or (D × E)

Exact          A           B         C          D           E          F
ages
15-19       473,959      52.60      90       426,563      58.44     24,930
20-24       470,869     162.50      90       423,782     180.56     76,516
25-29       466,865     144.04      90       420,178     160.04     67,247
30-34       462,031      91.51      90       415,828     101.68     42,280
35-39       455,983      48.11      90       410,385      53.45     21,937
40-44       448,110      14.42      90       403,299      16.02      6,462
45-49       437,429       1.32      90       393,686       1.47        577
Total                                                               239,949
Net Reproduction Rate                                                   116

* l_x values from a life table for 1942 computed by Scripps Foundation for Research in Population
Problems.
† See footnotes of Table 1. The figures in these columns are probabilities rather than central rates,
for they are obtained by relating (a) the number of births occurring during the age interval x to x+1
to (b) the number of women of exact age x.
‡ Any percentage which does not give a rate of over 1,000 in Column E could be used for illustrative
purposes.


Col. F.) If it is assumed that 10 per cent of the women are completely 
sterile, fecund women will constitute 90 per cent of all women of each 
age in the hypothetical cohort. (See Cols. A and D.) But for the same 
reason, fecund women will constitute 90 per cent of all women of each 
age in the actual population, hence the age specific birth rates for 
fecund women in the actual population (Col. E) will be 111.1 per cent 
of those for all women (Col. B). In consequence the number of births
to the women in the hypothetical cohort (Col. A × Col. B, or Col.
D × Col. E) is the same in either case, as is the net reproduction rate.
Treating each single year of age separately does not affect the com- 
parison. 
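
The cancellation can be checked directly from Columns A and B of Table 2. The following Python sketch recomputes the total of Column F with no allowance for sterility and with the 10 per cent allowance; the function and variable names are chosen here for illustration, and Column E is derived by dividing Column B by the fecund share rather than copied from the table, so small rounding differences from the printed figures are to be expected.

```python
# Columns A and B of Table 2 (native white women, 1942), five-year age groups.
WOMEN_IN_COHORT = [473_959, 470_869, 466_865, 462_031, 455_983, 448_110, 437_429]
BIRTHS_PER_1000_WOMEN = [52.60, 162.50, 144.04, 91.51, 48.11, 14.42, 1.32]


def cohort_births(fecund_share):
    """Total births to the hypothetical cohort for an assumed fecund share."""
    total = 0.0
    for women, rate in zip(WOMEN_IN_COHORT, BIRTHS_PER_1000_WOMEN):
        fecund_women = women * fecund_share      # Col. D = A x C
        rate_for_fecund = rate / fecund_share    # Col. E = B / C
        total += fecund_women * rate_for_fecund / 1000.0
    return total


no_allowance = cohort_births(1.00)
ten_per_cent_sterile = cohort_births(0.90)
print(round(no_allowance), round(ten_per_cent_sterile))
```

Both totals agree with each other and fall within a few births of the 239,949 shown in Table 2, the small gap arising because the printed Column F entries were rounded before being added.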

If a reproduction rate is based on age-parity specific rates, however, 
an allowance for complete sterility changes the results. This may be 
illustrated by using the fertility and mortality data for native white 
women in 1942 and assuming that either none or 10 per cent of the 
women are completely sterile, as before, but computing the net re- 
production rates from age-parity specific birth rates. Because these 
rates must be used by single years of age a table including all birth 
orders at all ages would require a large amount of space. In conse- 
quence, the results are shown in detail only for first births at ages 
15 through 24. (See Table 3.) If no allowance is made for sterility, a 
hypothetical cohort of 100,000 women will have 57,774 first births by 
exact age 25, and 82,400 by age 50. But if it is assumed that 10 per 
cent of the women are completely sterile, 100,000 women in the hypo- 
thetical cohort will have 57,073 first births by exact age 25, and 79,200 
by age 50. Births of other orders will be reduced in similar degree. 

The reason why an allowance for complete sterility does not affect a 
reproduction rate based on age specific birth rates but does affect one 
based on age-parity specific rates may be explained briefly. In the former 
case it was pointed out that the relative allowance is exactly the same 
at each age in the actual population and the hypothetical cohort, and 
is exactly offset by higher birth rates. In the latter case the relative 
allowance at each age is exactly the same for all women in the actual 
population and in the hypothetical cohort but not for the zero parity 
women. For example, the assumption that 10 per cent of the women are 
completely sterile at each age results in a group of such women amount- 
ing at age 20 to 12.1 per cent of the zero parity women in the 1942 
population and to 12.6 per cent of the zero parity women in the 
hypothetical cohort. At older ages the percentages are less similar, 
being 19.4 and 25.3 respectively at age 25, and 42.2 and 62.7 at age 40. 
Such differences occur only when (a) the zero parity women of a given 
age in an actual population have had at younger ages (and prior to 
the base year) age specific birth rates which differ from those of (b) 
the zero parity women of younger ages in the actual population during 
the given base year (which are also those of the hypothetical cohort 
prior to the given age). Since it is extremely improbable that the rates 
in (a) and (b) will be identical, or that the differences will be exactly 
TABLE 3

PART OF THE COMPUTATION OF THE NET REPRODUCTION RATE
FOR NATIVE WHITE WOMEN IN 1942 FROM AGE-PARITY
SPECIFIC BIRTH RATES

With and Without an Adjustment for Sterility and Spinsterhood

Column headings:
  A  Number of women in hypothetical cohort*
  B  Number of zero parity women: total†
  C  Number of zero parity women who cannot marry and/or are sterile
  D  Number of zero parity women who can marry and are fecund
  E  Probability of having a first birth‡ (per 1,000)
  F  Number of first births (D × E)
  G  Number of women having a first birth and living to next age§

Exact
age        A         B         C         D         E         F         G

Assuming All Women Are Fecund and Can Marry before Age 50
15      94,987    94,987     None     94,987      6.86       652       646
16      94,899    94,253     None     94,253     19.07     1,797     1,789
17      94,801    92,369     None     92,369     41.82     3,863     3,752
18      94,694    88,416     None     88,416     69.31     6,128     6,115
19      94,578    82,198     None     82,198     86.98     7,150     7,137
20      94,454    74,959     None     74,959    120.44     9,028     9,013
21      94,322    65,846     None     65,846    127.46     8,393     8,379
22      94,181    57,373     None     57,373    138.19     7,928     7,914
23      94,033    49,372     None     49,372    150.07     7,409     7,395
24      93,879    41,897     None     41,897    129.52     5,426     5,416
Total      -         -         -         -         -      57,774       -

Assuming 10 Per Cent of Women Are Sterile, and All Can Marry before Age 50¶
15      94,987    94,987     9,499    85,488      7.62       651       645
16      94,899    94,254     9,490    84,764     21.20     1,797     1,789
17      94,801    92,370     9,480    82,890     46.58     3,861     3,850
18      94,694    88,419     9,469    78,950     77.53     6,121     6,108
19      94,578    82,208     9,458    72,750     98.04     7,132     7,120
20      94,454    74,986     9,445    65,541    136.99     8,978     8,963
21      94,322    65,923     9,432    56,491    147.09     8,309     8,295
22      94,181    57,534     9,418    48,116    162.37     7,813     7,799
23      94,033    49,648     9,403    40,245    180.02     7,245     7,232
24      93,879    42,335     9,388    32,947    156.80     5,166     5,156
Total      -         -         -         -         -      57,073       -

Assuming 10 Per Cent of Women Are Sterile and 10 Per Cent Cannot Marry before Age 50**
15      94,987    94,987    18,048    76,939      8.47       652       646
16      94,899    94,253    18,031    76,222     23.58     1,797     1,789
17      94,801    92,369    18,012    74,357     51.90     3,859     3,848
18      94,694    88,420    17,992    70,428     86.79     6,112     6,099
19      94,578    82,218    17,970    64,248    110.71     7,113     7,101
20      94,454    75,015    17,946    57,069    156.33     8,922     8,907
21      94,322    66,008    17,921    48,087    170.76     8,211     8,198
22      94,181    57,716    17,894    39,822    192.71     7,674     7,660
23      94,033    49,968    17,866    32,102    219.43     7,044     7,031
24      93,879    42,856    17,837    25,019    193.49     4,841     4,832
Total      -         -         -         -         -      56,225       -

* l_x values from a life table computed by Scripps Foundation for Research in Population Problems.

† At age 15 Column B equals Column A. At each subsequent age the figure for Column B is obtained
by multiplying the number of single and ever married zero parity women of the preceding age by their
respective survival probabilities, and deducting the women in Column G for the preceding age.

compensating, it is equally improbable that an allowance for complete
sterility will not change a reproduction rate based on age-parity specific 
birth rates. 
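
The percentages quoted for age 20 can be recovered from the age 20 rows of Table 3, which gives a convenient check on the argument. In the Python sketch below the share of sterile women among zero parity women in the actual population is inferred from the ratio of the unadjusted to the adjusted first birth probability (Column E); this inference is an assumption of the sketch, not a step stated in the text, and the variable names are illustrative.

```python
# Age 20 rows of Table 3 (first births, native white women, 1942).
E_NO_ALLOWANCE = 120.44      # Col. E, all women assumed fecund
E_WITH_STERILITY = 136.99    # Col. E, 10 per cent of women assumed sterile
D_NO_ALLOWANCE = 74_959      # Col. D, all women assumed fecund
B_WITH_STERILITY = 74_986    # Col. B, zero parity women in the cohort
C_WITH_STERILITY = 9_445     # Col. C, sterile women in the cohort
D_WITH_STERILITY = 65_541    # Col. D, fecund zero parity women in the cohort

# Actual population: both probabilities rest on the same first births, so
# their ratio is the fecund share of zero parity women (assumed relation).
sterile_share_actual = 1.0 - E_NO_ALLOWANCE / E_WITH_STERILITY

# Hypothetical cohort: sterile women as a share of all zero parity women.
sterile_share_cohort = C_WITH_STERILITY / B_WITH_STERILITY

# First births at age 20 (Col. F = D x E) under each assumption.
births_no_allowance = D_NO_ALLOWANCE * E_NO_ALLOWANCE / 1000.0
births_with_sterility = D_WITH_STERILITY * E_WITH_STERILITY / 1000.0

print(f"sterile share, actual population: {100 * sterile_share_actual:.1f} per cent")
print(f"sterile share, hypothetical cohort: {100 * sterile_share_cohort:.1f} per cent")
print(f"first births at age 20: {births_no_allowance:.0f} vs {births_with_sterility:.0f}")
```

Because the two shares differ (about 12.1 against 12.6 per cent), the higher probability in Column E does not exactly offset the smaller number of women in Column D, and the first births at age 20 fall from about 9,028 to about 8,978.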

Among the women who are sterile during part of the childbearing 
period and fecund during the remainder of the period, the fecund 
period precedes the sterile period in almost all cases, hence the discus- 
sion of partial sterility will be based on this group. For reasons ex- 
plained in connection with full time sterility, an allowance for part 
time sterility has no effect on a reproduction rate computed from age 
specific birth rates. Whether it has an effect if the base rates are age- 
parity specific depends on the type of allowance. Part time sterility 
undoubtedly increases with age because of the longer period during 
which there may be exposure to the factors causing fecund women to 
become sterile, e.g., gonorrheal infection and tumors of the generative 
organs. It so happens, however, that if the onset of sterility depends 
entirely on age, an allowance for part time sterility has no effect on 
reproduction rates computed from either age specific or age-parity 
specific birth rates. Excluding the partially sterile women reduces each 
age and parity group in the same proportion in the actual population 
as in the hypothetical cohort. In consequence the increases in the age 
specific or age-parity specific birth rates computed for the actual popu- 
lation exactly offset the decreases in the number of fecund women in 
the hypothetical cohort. 

An allowance for a relation between bearing a child and becoming 
sterile would have no effect on reproduction rates based on age specific 
birth rates, but would be almost certain to affect reproduction rates 
based on age-parity specific birth rates. The reason is that (as was 
pointed out earlier) the parity distribution at a given age of the women 
in an actual population is almost certain to differ from that of the 
women in a hypothetical cohort exposed to the rates for the actual 
population in question. The important question thus becomes: Is 
there a relation between bearing a child and becoming sterile, and if so 
what is it? Everyone knows that some of the women who become sterile 
during a given year of age do so only because of events associated 
with childbearing. It is less generally realized, but true nevertheless, 





‡ In computing a net reproduction rate it is assumed that the women who die before the end of the
childbearing period have while living the same age specific birth rates as those who live to the end
of the period.

§ An allowance is made at each age for the higher mortality of women who bear a child at that age
than of other women.

¶ In this section of the table Column C equals 10 per cent of Column A.

** In this section of the table Column C equals 19 per cent of Column A. This percentage is obtained
by multiplying the assumed percentage fecund (90) by the assumed percentage that can marry (90)
and subtracting the product from 100.
that the onset of sterility in some of the other women of that age
might have been prevented by pregnancy and childbirth. The present 
difficulty is to estimate the extent to which childbearing causes or pre- 
vents sterility. Until more information is available on this matter it is 
not worthwhile to allow for partial sterility in computing reproduction 
rates. 

With complete sterility, in contrast, the choice lies between dis- 
regarding its effects, and estimating its incidence from the best ma- 
terial available. There is no perfect basis at present, nor is there likely 
to be one in the foreseeable future, for subdividing all women of the 
childbearing ages into two groups, one always sterile and the other 
fecund part or all of the time. It is possible, however, to set an upper 
limit on the proportion completely sterile, and thus determine the 
range within which the reproduction rates adjusted for this type of 
sterility should fall. Census tabulations show that of the native white 
married women aged 45-49 for whom the number of children ever 
born was reported, 15.7 per cent were of zero parity in 1940 and 10.0 
per cent in 1910. Allowing for the underreporting of children ever born 
reduces the 1940 figure to 12.7 per cent,¹¹ and probably would reduce
the 1910 figure to less than 8.0 per cent. Obviously not all of these 
women were childless because they themselves were sterile at all times. 
Some were childless because they were married to sterile husbands, and 
others, because they and their husbands were of low fecundity. In this 
analysis, however, such women will be included with those completely 
sterile, because they have the same effect on reproduction rates. Finally 
some of the childlessness was due to partial sterility, some to broken 
marriages, and some to contraception. 

Part of the increase in the percentage childless from 1910 to 1940 
probably was due to an increase in complete sterility. Studies of human 
fertility indicate, however, that the greater part was due to the more 
widespread and effective use of contraceptives. For this reason, it is 
unlikely that the proportion of native white married women aged 45- 
49 in 1940 who were always sterile exceeded the proportion reported 
childless in 1910, namely 10 per cent. 

Time will tell what proportion of the women under 45 in 1940 who 
marry by 45-49 will be childless when they reach that age. In view of 
the increase in childlessness between 1910 and 1940, however, it prob- 
ably will exceed 12.7 per cent, the adjusted 1940 figure. Here again it 
is probable that most of the increase will be due to the more wide- 
spread and effective use of contraceptives and relatively little to the 
greater incidence of complete sterility. It seems safe to conclude, 


11 Unpublished study by the writer. 
therefore, that 10 per cent is a maximum estimate for complete sterility
among women under 50 in 1940 who will marry before reaching that 
age. Little is known as to the incidence of complete sterility among 
women who do not marry before 50. The best that can be done is to 
assume that there is no relation between complete sterility and marriage,
hence that 10 per cent is the maximum estimate for these women also.


4. ALLOWING FOR THE EFFECT OF SPINSTERHOOD


If all children were borne by ever married women and none by single 
women the reasons for, and the effect of, allowing for spinsterhood 
would be the same as for sterility. Under these conditions it would 
make no difference whether age specific birth rates based on all women 
in the actual population were applied to all women in the hypothetical 
cohort, or whether age specific birth rates for women who marry in the 
actual population were applied to women who marry in the hypo- 
thetical cohort, provided that the proportion of single women who 
marry was the same at each age in the hypothetical cohort as in the 
actual population.¹² If the basic birth rates are specific for parity as
well as age, however, the situation is as different for spinsterhood as it 
is for full time sterility. If all women marry and 10 per cent are com- 
pletely sterile, the highest conceivable age-parity specific rates could 
not give more than 909 first births to a hypothetical cohort of 1,000 
women living to age 50. But if 10 per cent of the women remain 
spinsters (and virgins) and 10 per cent of those who marry are com- 
pletely sterile, the corresponding upper limit for first births is reduced 
to 810. Within these upper limits the effect of allowing for spinster- 
hood depends on the extent to which the proportion of the women of 
each age in the actual population who are of zero parity differs from 
the corresponding proportion for the hypothetical cohort. Under 
mortality and fertility conditions of 1942, and assuming that 10 per 
cent of the women are completely sterile, a hypothetical cohort of 
100,000 women will have 57,073 first births by exact age 25 and 79,200 
by exact age 50. But if it is assumed also that 10 per cent of the women 
cannot marry, the hypothetical cohort will have only 56,225 first 
births by exact age 25 and 75,100 by exact age 50. (See Table 3.) The 
principle involved was discussed in connection with the allowance for 
full time sterility and needs no further attention here. 
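
The combined allowance used in the third section of Table 3 follows from simple arithmetic, sketched below in Python. The sketch assumes, as the table footnotes do, that sterility and spinsterhood are independent, so that the share of women who are fecund and can marry is the product of the two shares; the function name is illustrative.

```python
def excluded_share(fecund_share, marrying_share):
    """Share of women who are sterile and/or cannot marry, assuming independence."""
    return 1.0 - fecund_share * marrying_share


WOMEN_AT_AGE_15 = 94_987   # Col. A of Table 3 at exact age 15

sterile_only = excluded_share(0.90, 1.00)        # 10 per cent
sterile_or_single = excluded_share(0.90, 0.90)   # 19 per cent

# Col. C of Table 3 at age 15 under the two assumptions.
col_c_sterile_only = round(WOMEN_AT_AGE_15 * sterile_only)
col_c_sterile_or_single = round(WOMEN_AT_AGE_15 * sterile_or_single)

# Upper limit of first births per 1,000 women when both allowances are made.
upper_limit_both = round(1000 * 0.90 * 0.90)

print(col_c_sterile_only, col_c_sterile_or_single, upper_limit_both)
```

The two Column C values agree with the age 15 entries of Table 3 (9,499 and 18,048), and the ceiling of 810 first births per 1,000 women is the one stated in the text for the case in which both allowances are made.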

If spinsterhood and virginity were synonymous the ideal way to 
allow for it in computing a reproduction rate would be to (a) compute 

12 Marital status is customarily disregarded in computing reproduction rates, which is equivalent
to assuming that (a) the proportion ever married is the same at each age in the hypothetical cohort
as in the actual population, or (b) differences in these proportions are of no consequence.
the probability of marrying for single women of each age in the base
population, (b) apply these probabilities to the single women in the 
hypothetical cohort, (c) relate first births to ever married (rather 
than total) fecund zero parity women by age in the base population, 
and (d) apply the resulting probabilities to the ever married fecund 
zero parity women obtained in “b.” This is impracticable because ade- 
quate information as to the number of first marriages by color and age 
of bride is lacking for most years.¹³

Under the circumstances the best procedure seems to be to (a) 
assume that all single women who bear a child marry before reaching 
the end of the childbearing period (or before dying), which probably 
is very close to the truth, and (b) allow for spinsterhood in the same 
way as for complete sterility, namely, by computing birth probabilities 
for, and applying them to, the women who may be expected to marry 
before they become too old to bear a child, which may well be set at age 
50. Census reports show that approximately 10 per cent of the native 
white women aged 45-54 have been single in each census since 1890, the 
proportion varying from a high of 11.1 per cent in 1920 to a low of 
8.2 per cent in 1890. In view of the narrow fluctuation of the proportion 
around 10 per cent this figure seems the best to apply to each cohort. 


5. REPRODUCTION RATES WITH AND WITHOUT ADJUSTMENTS 
FOR PARITY, FECUNDITY, AND MARRIAGE 


Net reproduction rates for native white women in the United States 
have been computed for each year from 1920 to 1944 from age, age- 
parity, and age-parity-fecundity-marriage specific birth rates. In two 
years the age adjusted rate is the same as the age-parity adjusted rate, 
being 105 in 1941 and 114 in 1944 in either case. (See Table 4.) In 18 
years the adjustment for parity lowers the net reproduction rate by 
one or more points, the largest decreases being 11 in 1933 and 9 in 
1932. In five years the adjustment raises the rate, the largest increases 
being 4 in 1921, and 3 in 1942 and 1943. Adjusting for either sterility 
or spinsterhood reduces the age-parity adjusted rate in 14 years, and 
raises it in 9 years, the maximum decrease being 7 in 1942 and the 
maximum increase 4 in 1933. Adjusting for parity, spinsterhood, and 
sterility gives results below the conventional age adjusted rates in 
every year except 1921 when it causes no change. The largest decreases 
—seven points—occur in 1931-33. Similar statements can be made for 
the gross reproduction rate and the intrinsic rate of natural increase. 

Reproduction rates adjusted for age and parity, or for age, parity, 
sterility, and spinsterhood, show the wartime rise in fertility to have 
been somewhat larger than has been thought on the basis of the conventional
reproduction rates. From 1940 to 1943 the age adjusted net
reproduction rate increased 21.0 per cent (from 100 to 121), the age-parity
adjusted rate increased 27.6 per cent (from 98 to 125), and the
age-parity-fecundity-marriage adjusted rate increased 22.7 per cent
(from 97 to 119).


13 A complicating factor (the births that occur to single women) would necessitate the use of
birth probabilities for single women as well as for married women.


TABLE 4

GROSS AND NET REPRODUCTION RATES AND INTRINSIC RATES OF NATURAL
INCREASE BASED ON AGE SPECIFIC, AGE-PARITY SPECIFIC, AND
AGE-PARITY-FECUNDITY-MARRIAGE SPECIFIC BIRTH RATES FOR
NATIVE WHITE WOMEN, UNITED STATES 1920-44

Column headings:
  (1)  Age-Adjusted
  (2)  Age-Parity-Adjusted
  (3)  Age-Parity-Fecundity-Marriage-Adjusted

    Gross Reproduction Rate*          Intrinsic Rate of Natural Increase†
Year     (1)     (2)     (3)          Year     (1)     (2)     (3)
1944     122     122     119          1944    4.70    4.87    3.88
1943     130     134     128          1943    7.01    8.21    6.58
1942     125     128     120          1942    5.60    6.48    4.28
1941     113     113     110          1941    1.87    1.97     .85
1940     108     105     104          1940    -.03    -.84   -1.23

                      Net Reproduction Rate*
Year     (1)     (2)     (3)          Year     (1)     (2)     (3)
1944     114     114     111          1931     103      95      96
1943     121     125     119          1930     107     102     103
1942     116     119     112          1929     105      98     100
1941     105     105     102          1928     110     105     106
1940     100      98      97          1927     118     115     114
1939      96      93      92          1926     118     115     115
1938      99      97      96          1925     123     122     121
1937      95      91      91          1924     127     128     126
1936      94      88      89          1923     12?     123     122
1935      96      89      91          1922     124     123     122
1934      96      89      91          1921     133     137     133
1933      94      83      87          1920     125     127     124
1932     100      91      93

* The rate per 100 persons; a stationary population has a rate of 100.
† The rate per 1,000 persons; a stationary population has a rate of zero.
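
The three percentage increases quoted for 1940 to 1943 follow directly from the net reproduction rates named in the text, as the short Python check below shows. The helper name and dictionary layout are illustrative only.

```python
def per_cent_increase(start, end):
    """Percentage change from start to end."""
    return 100.0 * (end - start) / start


# Net reproduction rates for 1940 and 1943 as quoted in the text.
NET_RATES = {
    "age adjusted": (100, 121),
    "age-parity adjusted": (98, 125),
    "age-parity-fecundity-marriage adjusted": (97, 119),
}

increases = {
    basis: round(per_cent_increase(rate_1940, rate_1943), 1)
    for basis, (rate_1940, rate_1943) in NET_RATES.items()
}

for basis, change in increases.items():
    print(f"{basis}: {change} per cent")
```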

In interpreting the reproduction rates computed on the bases under 
consideration, it should be remembered that the proportion of women 
assumed to be completely sterile (10 per cent) was chosen as a maxi- 
mum, hence that the reproduction rates based on age-parity-fecundity- 
marriage specific birth rates represent extreme values. The true values 
are between these and the rates adjusted for age and parity, but un- 
doubtedly are closer to the former than the latter. 

Although the age adjusted reproduction rates can be criticized from 
a methodological standpoint, the numerical differences between them 
and the age-parity adjusted rates or the age-parity-fecundity-marriage 
adjusted rates are not large. This is fortunate, of course, because the 
former are much simpler to compute than the latter. Furthermore, the 
basic data required for age specific birth rates are available for a relatively
large number of populations, whereas those required for age-
parity, or age-parity-fecundity-marriage specific birth rates are avail- 
able for relatively few populations. 

Taking parity, sterility, and spinsterhood into account makes much 
larger changes in the prolificacy distribution of a hypothetical cohort 
than in its reproduction rates. The most striking difference occurs in 
the proportion of zero parity women in the hypothetical cohort of 1942. 
Childless women would constitute −8.4 per cent of the cohort¹⁴ according
to the age specific birth rates, 12.3 per cent according to age-parity
specific birth rates, and 20.1 per cent according to age-marriage- 
fecundity-parity specific birth rates. But prolificacy distribution is a 
matter for discussion in another paper rather than here. 
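
The impossible negative figure comes from subtracting first births from women. A minimal Python illustration follows; it takes first births under the age specific rates of 1942 as 1,084 per 1,000 women, the number implied by the 108.4 per cent stated in the accompanying footnote, and the function name is illustrative.

```python
def childless_per_cent(first_births_per_1000_women):
    """Per cent of women left childless when every first birth is assigned to a different woman."""
    return 100.0 - first_births_per_1000_women / 10.0


# Assumed figure: 1,084 first births per 1,000 women (108.4 per cent).
age_specific_result = childless_per_cent(1084)
print(f"{age_specific_result:.1f} per cent childless")
```

Any count of first births above 1,000 per 1,000 women forces the childless share below zero, which is why the age specific basis breaks down in a year of very high first birth rates.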

Even though reproduction rates based on age-parity-fecundity- 
marriage specific birth rates for women who marry and are not com-
pletely sterile will not be used widely in the near future, the question of
terminology may deserve some attention. It seems to the writer that 
two ideas should be kept in mind, namely, (a) the terminology used 
to date with reproduction rates has become well established and 
should be changed as little as possible, and (b) the fundamental con- 
cepts involved in the terms gross or net reproduction rate and in- 
trinsic rate of natural increase have been refined rather than changed 
basically. One possibility would be to refer to the conventional rates 
as before, e.g., “net reproduction rates,” but add the word “refined” 
in referring to the more accurate rates described in this paper, e.g., 
“refined net reproduction rates.” The chief drawback is that “refined” 
has been used in the past in referring to rates adjusted for residence, 
age, and sex,¹⁵ and, by itself, gives no clue as to the type of refinement.
Other single words could only give a partial clue at best. It is sug- 
gested, therefore, that the phrases “age-adjusted” or “age-parity- 
fecundity-marriage adjusted” be used as prefixes. Thus the conven- 
tional net reproduction rate used heretofore would be called the “age- 
adjusted net reproduction rate,” and the more accurate rates described 
here the “age-parity,” or the “age-parity-fecundity-marriage adjusted 
net reproduction rate.” Where space is at a premium, abbreviations 
could be used, e.g., “Age-Adj. Net Repro. Rate” and “Age-Par-Fec- 
Mar-Adj. Net Repro. Rate.” It is realized that these terms are long, 
but it is believed that the disadvantage of their length is offset by the 
fact that they are relatively self-explanatory. 

¹⁴ There would be 1,084 first births per 1,000 women, hence 108.4 per cent of the women would have
one or more births and -8.4 per cent would not bear a child.
¹⁵ United States Bureau of the Census: “Mortality Statistics, 1925,” Part II, Washington, G.P.O.,
1929, p. 28.

THE PROBLEM OF NON-RESPONSE IN SAMPLE SURVEYS*


MORRIS H. HANSEN AND WILLIAM N. HURWITZ


The mail questionnaire is used in a number of surveys be- 
cause of the economies involved. The principal objection to 
this method of collecting factual information is that it 
generally involves a large non-response rate, and an unknown 
bias is involved in any assumption that those responding are 
representative of the combined total of respondents and non- 
respondents. 

Personal interviews generally elicit a substantially com- 
plete response, but the cost per schedule is, of course, con- 
siderably higher than it would be for the mail questionnaire 
method. The purpose of this paper is to indicate a technique 
which combines the advantages of both procedures. 

The principle followed is to mail schedules in excess of 
the number expected to be returned, and to follow up by 
enumerating a sample of those that do not respond to the 
mail canvass. Under reasonable assumptions as to the relative 
costs of the two methods of canvass, an allocation of the sample 
can be made to mail and field canvasses. An illustration is 
given to show, for a given degree of reliability, the varying
sizes of the mailing list for different expected response rates,
and the rate of field follow-up on the non-responses. For each
response rate, the minimum cost of the survey is computed;
from this computation it is possible to determine the maxi- 
mum number of schedules to be mailed independent of the 
rate of response. Then to achieve the desired precision, the 
number to be interviewed would vary with the response rate 
actually found. 

In a mathematical appendix the general formulas are de- 
rived. 

THE MAIL questionnaire is used in a number of surveys because of the
economies involved. The principal objection to this method of col- 
lecting factual information is that it generally involves a large non- 
response rate, and an unknown bias is involved in any assumption that 
those responding are representative of the combined total of respond- 
ents and non-respondents. Personal interviews generally elicit a sub- 
stantially complete response, but the cost per schedule is, of course, 
considerably higher than it would be for the mail questionnaire method. 
The purpose of this paper is to indicate a technique which combines the 
advantages of both procedures. 
The problem considered is to determine the number of mail question-
naires to be sent out and the number of personal interviews to take in
following up non-responses to the mail questionnaire in order to attain
the required precision at a minimum cost. The procedure outlined
below¹ can be applied whatever the methods of collecting data are. For
example, perhaps as important as the problem of non-response in
using mail questionnaires is the problem of call-backs in taking field
interviews. In this latter problem the procedure to minimize cost for
a given degree of reliability would call for taking a larger sample of first
interviews and calling back on a fraction of “those not at home.” The
technique presented herein makes it possible to use unbiased designs
at a reasonable cost where the excessive cost of ordinary methods of
follow-up has frequently led to abandoning them.

* This paper was presented at the annual meeting of the American Statistical Association on
January 26, 1946 in Cleveland, Ohio.

To illustrate the principles, the simplest random sampling and esti- 
mating procedures are assumed in the early part of the text. The princi- 
ples hold however for stratified sampling (see Section (b) page 525) and 
for other methods of estimation, as where a sample is used to estimate 
rate of change between two periods (see Section (c) page 526). 

As an illustration, let us assume we want to estimate the number of 
employees in retail stores during a specified period in the State of 
Indiana. We shall assume we have a listing of all establishments having 
one or more employees, say from Social Security records, and their 
corresponding mailing addresses. A procedure sometimes followed is to 
take a sample of addresses from this list, mail out the questionnaires, 
and then depend exclusively on the mail returns for the estimate of 
number of employees for all retail stores in the State. The result of 
this procedure usually will be biased. It may be seriously so if there is 
a large rate of non-response. On the other hand, if all the addresses 
were actually visited by an enumerator, the cost of collecting the in- 
formation would be much greater. 

¹ The procedure given in this paper is an adaptation of the principles of double sampling developed
by J. Neyman; see “Contributions to the theory of sampling human populations,” Journal of the Ameri-
can Statistical Association, Vol. 33 (1938), pp. 101-116.

Suppose the cost of mailing is 10 cents per questionnaire mailed,
and the cost of processing the returns is 40 cents per questionnaire re-
turned. Suppose, on the other hand, that the cost of carrying through
field interviews is $4.10 per questionnaire, and that this cost together
with the cost of processing the field returns is $4.50 per questionnaire.
For the cost of one field visit we could then obtain about eight mail
questionnaires if there were a 60 per cent response and five mail
questionnaires with only a 20 per cent response rate. This does not mean
that we should take our entire sample by mail even though for the
fixed cost we can make the actual sample size perhaps five to eight times
as large as it would be if all the respondents were actually visited. It is
a common fallacy to assume that the size of the sampling error de-
creases in proportion to the square root of the number of schedules in
the sample. Actually, in practice, this is seldom true. The sampling
error depends on the over-all design of the sample, of which the sample
size is only one factor.
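
The "about eight" and "five" mail questionnaires per field visit quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper name `mail_returns_per_field_visit` is an assumption introduced here, and the only inputs are the unit costs stated in the text.

```python
from fractions import Fraction

# Unit costs quoted in the text, in dollars.
MAIL_COST = Fraction(10, 100)      # per questionnaire mailed
PROCESS_COST = Fraction(40, 100)   # per mail return processed
FIELD_COST = Fraction(450, 100)    # per field interview, processing included


def mail_returns_per_field_visit(response_rate):
    """Processed mail returns obtainable for the cost of one field visit.

    Hypothetical helper: on average 1/response_rate questionnaires must be
    mailed to obtain one return, so one return costs
    MAIL_COST / response_rate + PROCESS_COST.
    """
    cost_per_return = MAIL_COST / response_rate + PROCESS_COST
    return FIELD_COST / cost_per_return


ratio_60 = mail_returns_per_field_visit(Fraction(60, 100))
ratio_20 = mail_returns_per_field_visit(Fraction(20, 100))
print(float(ratio_60), float(ratio_20))
```

At a 20 per cent response rate the ratio is exactly five; at 60 per cent it is a little under eight.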

To illustrate, assume that questionnaires are sent to a sample of 
1,000 addresses drawn at random from a complete list of 40,000 stores. 
Assume further that 500 or 50 per cent respond, and that of the 500 
non-respondents 50 or 10 per cent are visited in order to insure some 
representation of the class of those that did not respond to the mail 
questionnaire. An unbiased estimate of the total number of employees 
is obtained from this sample by computing:

$$x' = \frac{N}{n}\left(m\,\bar{x}_1' + s\,\bar{x}_2''\right), \tag{1}$$

where
$N$ = 40,000 = the number of addresses on the mailing list;
$n$ = 1,000 = the number of questionnaires mailed out;
$\bar{x}_1'$ = the average number of employees per establishment for the
stores responding to the mailed questionnaire;
$m$ = 500 = the number of such establishments;
$\bar{x}_2''$ = the average number of employees per establishment for the field
interviews;
$s$ = 500 = the number in the sample of the 1,000 questionnaires
originally sent out that did not respond to the mailed questionnaire.
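
As a rough illustration of estimate (1), the sketch below evaluates it for the sample sizes used in the text. The two averages (6.0 and 3.0 employees per establishment) are hypothetical values chosen only for the example, and the function name `estimate_total` is likewise an assumption.

```python
def estimate_total(N, n, m, mean_mail, s, mean_field):
    """Estimate (1): x' = (N / n) * (m * mean_mail + s * mean_field)."""
    if m + s != n:
        raise ValueError("mail respondents and non-respondents must add up to n")
    return N / n * (m * mean_mail + s * mean_field)


# Sample sizes from the text; the two averages are hypothetical.
x_hat = estimate_total(N=40_000, n=1_000, m=500, mean_mail=6.0, s=500, mean_field=3.0)

# Estimate that would result from relying on the mail returns alone.
mail_only = 40_000 * 6.0
print(x_hat, mail_only)
```

With these made-up averages the mail-only figure overstates the combined estimate, which is the kind of non-response bias the field follow-up is meant to remove.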
It is noted that the total size of sample actually processed would be
equal to 500+50 or 550.
The sample variance of $x'$, if the sampling method and estimating
procedure specified are followed, is given by²

$$\sigma_{x'}^2 = N^2\,\frac{N-n}{(N-1)\,n}\,\sigma^2 + \frac{N}{n}\,\frac{S^2}{S-1}\,(k-1)\,\sigma_2^2, \tag{2}$$

where
$\sigma^2$ is the variance in the entire population between the original estab-
lishments;
$\sigma_2^2$ is the variance among those not responding to the mailed ques-
tionnaire;
$S$ is the number of establishments in the population that would not
have responded to the mailed questionnaire had the mailed ques-
tionnaire been sent to all establishments;
$r$ is the number of establishments in the sample visited in the field;
and $k = s/r$, where, as already indicated, $s$ is the number of non-respond-
ents in the sample.

² See appendix for development of this formula.

With this variance formula it can readily be shown that there are
widely different sizes of samples which will have the same reliability
and that a point will be reached where the size of sample alone will be
a very poor indicator of the sampling reliability. For example, assume
that $\sigma^2 = \sigma_2^2$ and that $N$ and $S$ are sufficiently large that $N/(N-1)$ and
$S/(S-1)$ are approximately equal to one. Further assume that the accuracy
required is such that the average sampling error, $\varepsilon$, would be given by a
sample of 1,000 when the rate of response is 100 per cent. If question-
naires were mailed to a random sample of $\tilde{n}$ establishments and the
response rate were 100 per cent, the variance of the total estimated
from the sample would be given by

$$N^2\,\frac{N-\tilde{n}}{(N-1)\,\tilde{n}}\,\sigma^2,$$

where $\sigma^2$ is defined as above. Thus

$$\varepsilon^2 = N^2\,\frac{N-1{,}000}{(N-1)\,1{,}000}\,\sigma^2.$$

When we substitute various numerical values representing various pro-
portions of mailed returns and field interviews for the symbols in
formula (2) we find that there are a number of samples differing widely
in size but each of which has the same average reliability.

Column 5 of Table 1 shows various sample sizes each of which yields 
the specified precision in the sample estimate. For example, sending 
out 1,500 questionnaires, obtaining a total of 1,125 questionnaires 
actually processed (750 by mail and 375 by field interview) yields exact- 
ly the same sampling error as sending out 10,000 questionnaires and ob- 
taining a total of 5,263 questionnaires in the sample (5,000 by mail and 
263 by field interview). It follows that at some point it would be un- 
profitable to put money into obtaining additional mail questionnaires 
and that it would be better to spend that money on obtaining returns 
from those not responding to the mail questionnaires. 

Column 6 of Table 1 shows the total cost for each of the sample sizes
under the unit costs assumed in Table 1. Since the table is so con-
structed that the varying number of schedules tabulated all lead to
exactly the same precision in the sample estimate, it is logical to pick
that particular sample size that would lead to a minimum cost. The
minimum cost will be achieved when 2,000 schedules are sent out, 1,000
of them are returned by mail (at the assumed 50 per cent response rate)
and 333 of the non-respondents are interviewed in the field.


TABLE 1
SAMPLES OF DIFFERENT SIZES THAT LEAD TO SAME PRECISION OF
RESULTS, THROUGH JOINT USE OF MAIL AND ENUMERATION
METHODS ASSUMING A 50 PER CENT RESPONSE RATE

     n         m         s         r      Schedules     Cost
                                          Tabulated
    (1)       (2)       (3)       (4)        (5)         (6)
   1,000       500       500       500      1,000      $2,550
   1,500       750       750       375      1,125       2,138
   2,000     1,000     1,000       333      1,333       2,099
   2,500     1,250     1,250       313      1,563       2,159
   3,000     1,500     1,500       300      1,800       2,250
   4,000     2,000     2,000       286      2,286       2,487
   5,000     2,500     2,500       278      2,778       2,751
  10,000     5,000     5,000       263      5,263       4,184

$C_1$ = $0.10 = Cost per questionnaire of mailing
$C_2$ = $0.40 = Cost per questionnaire of processing returned questionnaires
$C_3$ = $4.50 = Cost per questionnaire of both enumerating and processing those obtained by field
interviews
n = Number of questionnaires mailed out
m = Number of mail respondents
s = Number of non-respondents to mail canvass
r = Number of field interviews among the non-respondents
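
The rows of Table 1 can be regenerated from the unit costs and the equal-precision condition. The sketch below is a reading of the text rather than the authors' own working: it assumes that, with equal variances and large N and S, a mailing of n with every k-th non-respondent interviewed matches the precision of 1,000 complete returns when n = 1,000{1 + (k - 1)Q}. The function names are hypothetical, and exact fractions are used so that halves round upward as in the printed table.

```python
from fractions import Fraction


def half_up(x):
    """Round a non-negative number to the nearest integer, halves upward."""
    return int(x + Fraction(1, 2))


def equal_precision_row(n, response_rate=Fraction(1, 2), base_n=1000):
    """Return (n, m, s, r, schedules tabulated, cost in dollars) for a mailing of n."""
    P = response_rate
    Q = 1 - P
    m = n * P                                   # mail respondents
    s = n * Q                                   # non-respondents
    k = 1 + (Fraction(n, base_n) - 1) / Q       # solve n = base_n * (1 + (k - 1) * Q)
    r = half_up(s / k)                          # field interviews
    cost_cents = 10 * n + 40 * m + 450 * r      # C1, C2, C3 expressed in cents
    return (n, int(m), int(s), r, int(m) + r, half_up(cost_cents / 100))


rows = [equal_precision_row(n) for n in (1000, 1500, 2000, 2500, 3000, 4000, 5000, 10000)]
cheapest = min(rows, key=lambda row: row[5])
for row in rows:
    print(row)
print("cheapest mailing:", cheapest[0])
```

Among the mailing sizes listed, the cheapest design is the one with 2,000 schedules mailed, in agreement with the text.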


Instead of proceeding by trial and error, as has been done above, to
determine the optimum numbers of schedules to mail out and to pick
up by field interview, the optimum values of n and r can be computed
from the following relatively simple formulas:

$$n = \tilde{n}\,\{1 + (k-1)\,Q\}, \qquad r = \frac{s}{k}, \tag{3}$$
where

$$k = \sqrt{\frac{C_3\,P}{C_1 + C_2\,P}} \tag{4}$$

and

$$\tilde{n} = \frac{N^2\sigma^2}{\varepsilon^2 + N\sigma^2}. \tag{5}$$

$P$ is the rate of response to the mailed questionnaire, $Q = 1 - P$, and, as
indicated earlier, $\varepsilon$ is the expected average error (standard error) to be
tolerated in the total being estimated. Formulas (3) and (4) were ob-
tained under the assumptions that $\sigma^2 = \sigma_2^2$ and $\frac{N}{N-1} = \frac{S}{S-1} \doteq 1$. The
optimum values of n and r without these assumptions are














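$$n = \tilde{n}\left\{1 + (k-1)\,Q^2\,\frac{N-1}{S-1}\,\frac{\sigma_2^2}{\sigma^2}\right\}, \qquad r = \frac{s}{k}, \tag{6}$$

with

$$k = \sqrt{\left(\frac{N^2(S-1)\,\sigma^2}{S^2(N-1)\,\sigma_2^2} - 1\right)\frac{C_3\,Q}{C_1 + C_2\,P}}. \tag{7}$$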
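
A short numerical sketch may help in reading the simplified optimum. It takes formula (4) as k = sqrt(C3 P / (C1 + C2 P)) and formula (3) as n = ñ{1 + (k - 1)Q}, with ñ = 1,000 and the unit costs of Table 1; the function names and the use of expected (unrounded) sample sizes are assumptions made here for illustration.

```python
import math


def optimum_design(P, n_full=1000.0, c_mail=0.10, c_proc=0.40, c_field=4.50):
    """Return (k, n, r, cost) for the simplified optimum at response rate P."""
    Q = 1.0 - P
    k = math.sqrt(c_field * P / (c_mail + c_proc * P))   # formula (4)
    n = n_full * (1.0 + (k - 1.0) * Q)                   # formula (3)
    r = Q * n / k                                        # expected s / k
    cost = c_mail * n + c_proc * P * n + c_field * r
    return k, n, r, cost


def cost_for_k(P, k, n_full=1000.0, c_mail=0.10, c_proc=0.40, c_field=4.50):
    """Cost of an equal-precision design that interviews one non-respondent in k."""
    Q = 1.0 - P
    n = n_full * (1.0 + (k - 1.0) * Q)
    return c_mail * n + c_proc * P * n + c_field * Q * n / k


k_opt, n_opt, r_opt, cost_opt = optimum_design(0.5)
print(round(k_opt, 3), round(n_opt), round(r_opt), round(cost_opt))
```

For a 50 per cent response rate the results fall within a unit or two of the mailing size, number of interviews, and cost printed for that rate in Table 2; small differences are to be expected because the printed figures were rounded at intermediate steps. Evaluating `cost_for_k` at values of k on either side of `k_opt` gives higher costs, as the minimization requires.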


Of course, the number of mail questionnaires and field interviews re- 
quired to achieve a specified precision will vary with the response rate. 
In practice, one may not know even approximately what the response 
rate will be, whereas in order to estimate the optimum values from the 
above formulas, the approximate response rate must be known in ad- 
vance of the survey. When the response rate is not known in advance 
one still may want to design the survey so as to achieve at least a cer- 
tain specified precision at minimum cost, and at the same time to know 
about what the cost of taking the survey will be. Even under such cir- 
cumstances it is possible to determine the optimum number of schedules 
to be sent out, and the optimum number of field interviews to be taken 
from among the non-respondents. For example, instead of assuming a 
50 per cent response rate as in the above illustration, let us compute the 
optimum values of n and r for response rates varying from 10 per cent 
to 90 per cent, where we still want to achieve the same precision of re- 
sults. The optimum values, computed from formulas (3) and (4), and 
the corresponding costs are shown in Table 2. 

TABLE 2
SAMPLES THAT LEAD TO SAME PRECISION OF RESULTS
THROUGH JOINT USE OF MAIL AND ENUMERATION
METHODS, FOR VARIOUS RESPONSE RATES

(Also comparison of the minimum costs for various response rates with the cost of
sending out 1,000 questionnaires with a 100 per cent follow-up
of the non-respondents.)

  Response                          Cost of       Cost of
   rate P        n         r        Optimum     Alternative 1
    (1)         (2)       (3)         (4)           (5)
    .10        1,714      860       $4,110        $4,190
    .20        1,989      711        3,558         3,780
    .30        2,034      575        3,035         3,370
    .40        1,979      451        2,544         2,960
    .50        1,870      341        2,096         2,550
    .60        1,727      245        1,690         2,140
    .70        1,564      163        1,328         1,730
    .80        1,386       95        1,010         1,320
    .90        1,197       40          731           910

Column 4, Table 2 shows the cost of a survey with optimum joint use
of mail and interview methods when the response rates are known.
Where the response rates are unknown, one alternative (referred to as
Alternative 1 in the above table) which makes some use of the econo-
mies of using the mail questionnaire is to send out 1,000 questionnaires
and follow up on all the non-responses regardless of the response rate.
The sampling variance would then be
$N^2\,\frac{N-1{,}000}{(N-1)\,1{,}000}\,\sigma^2 = \varepsilon^2$, no matter
what the rate of response, and is the same as that for the other alterna-
tives presented in Tables 1 and 2. However, for any given response rate,
the cost of this alternative will always be larger than the optimum. It is
of interest to see how much more costly this procedure is than the
optimum for various response rates. A comparison of columns (4) and
(5) shows that the increase in cost over the optimum is smallest for the
very low rates of response, which is to be expected since not enough
questionnaires have been received to take full advantage of the econo-
mies of the mail questionnaire. For response rates 30 per cent or greater,
increases in cost of from 10 per cent to 24 per cent are to be expected if
this alternative is used.
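
The comparison between the optimum and Alternative 1 can be reproduced directly from the unit costs. The following is only a sketch under stated assumptions: the optimum cost is computed from formulas (3) and (4) with ñ = 1,000 and expected (unrounded) sample sizes, Alternative 1 is taken as 1,000 mailings with every non-respondent interviewed, and the function names are hypothetical.

```python
import math

C_MAIL, C_PROC, C_FIELD = 0.10, 0.40, 4.50   # unit costs in dollars
N_FULL = 1000.0                              # sample size needed with complete response


def optimum_cost(P):
    """Minimum expected cost at response rate P, using formulas (3) and (4)."""
    Q = 1.0 - P
    k = math.sqrt(C_FIELD * P / (C_MAIL + C_PROC * P))
    n = N_FULL * (1.0 + (k - 1.0) * Q)
    return n * (C_MAIL + C_PROC * P + C_FIELD * Q / k)


def alternative_1_cost(P):
    """Cost of mailing 1,000 questionnaires and interviewing every non-respondent."""
    return N_FULL * (C_MAIL + C_PROC * P + C_FIELD * (1.0 - P))


for pct in range(10, 100, 10):
    P = pct / 100
    best, alt1 = optimum_cost(P), alternative_1_cost(P)
    print(pct, round(best), round(alt1), f"{100 * (alt1 / best - 1):.0f}% above optimum")
```

The computed optimum costs can differ from the printed column by a dollar or two, since the printed table appears to be based on rounded sample sizes.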

Where the approximate response rate is not known in advance, the
following procedure, referred to as Alternative 2, is preferable to
Alternative 1 described above because it generally makes more effective
use of the economies possible through the mail questionnaire method.
The first step is to determine the maximum number of questionnaires
to be sent out no matter what the rate of response. The second step is
to determine the number to be interviewed in order to achieve the de-
sired precision after the maximum number has been sent out and the
rate of response is actually determined from the sample returns. Hence,
the number to be interviewed would vary with the response rate
actually found.

Note from column 2 of Table 2 that for the problem we are discuss-
ing, the maximum number to be sent out, no matter what the rate of
response, is 2,034 questionnaires. If 40 per cent responded, for example,
then, using formula (3), we find that the number to be interviewed
would be 448.

Table 3 shows, for varying rates of response to the mailed question-
naire, the number of field interviews to be taken to achieve the desired
precision if 2,034 schedules are mailed, and the total cost for each of
the response rates. The cost of the optimum that could have been used
were the response rates known in advance is also shown.


TABLE 3
COMPARISON BETWEEN COST OF OPTIMUM IF RESPONSE
RATE WERE KNOWN AND ALTERNATIVES 1 AND 2

  Response     Mail       Field       Cost of          Cost of        Cost of
    rate     responses  interviews  Alternative 2   Alternative 1     Optimum
    (1)        (2)        (3)          (4)              (5)             (6)
    .10         203        852        $4,119           $4,190          $4,110
    .20         407        710         3,561            3,780           3,558
    .30         610        575         3,035            3,370           3,035
    .40         814        448         2,545            2,960           2,544
    .50       1,017        331         2,100            2,550           2,096
    .60       1,220        227         1,713            2,140           1,690
    .70       1,424        137         1,390            1,730           1,328
    .80       1,627         66         1,151            1,320           1,010
    .90       1,831         18         1,017              910             731





A comparison of column (4) with column (6) in Table 3 indicates 
that except for the very high response rates, the lack of any advance 
knowledge of the rates of response entailed almost no additional cost 
over the optimum values when the rates are known. When the rates 
of response are high, of course, the total cost of the survey will be small 
even though an unnecessarily large number of questionnaires had 
originally been sent out. 
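
Alternative 2 lends itself to a short calculation: the mailing is fixed at 2,034, and the number of field interviews is obtained by solving formula (3) for k once the response rate is observed. The sketch below assumes ñ = 1,000 and the unit costs of Table 1; the function name is hypothetical and the sample sizes are left unrounded.

```python
def alternative_2(P, n_mailed=2034, n_full=1000.0,
                  c_mail=0.10, c_proc=0.40, c_field=4.50):
    """Field interviews and cost when n_mailed is fixed and P is observed afterwards."""
    Q = 1.0 - P
    k = 1.0 + (n_mailed / n_full - 1.0) / Q   # solve n = n_full * (1 + (k - 1) * Q) for k
    r = Q * n_mailed / k                      # interviews among the non-respondents
    cost = c_mail * n_mailed + c_proc * P * n_mailed + c_field * r
    return r, cost


r_40, cost_40 = alternative_2(0.40)
r_90, cost_90 = alternative_2(0.90)
print(round(r_40), round(cost_40), round(r_90), round(cost_90))
```

At a 40 per cent response rate this gives the 448 interviews quoted in the text, and at 90 per cent it gives about 18.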

Thus, it can be seen that not only can optimum values of n and r
be found for a response rate known in advance, but an optimum pro-
cedure can be found even where nothing is known in advance about
the rate of response, and this procedure will produce results having at
least the specified precision and at low cost. Of course, if the response
rate is known approximately in advance, the use of this information in
determining the optimum use of mail and enumerative response will
lead to slightly lower cost.

Some further comments on this problem: 

(a) In actual practice a mail survey has a time limit. All schedules 
arriving before the deadline constitute the mail response and the field 
follow-up sampling ratio must be applied to all on the mailing list that 
did not respond before that date. The relatively few schedules arriving 
after that date, unless they are designated for interview, must be ex- 
cluded from the sample, in order to avoid a bias of non-response of 
the type which we are trying to eliminate. The cut-off date of course 
should be held off until the mail response is substantially completed 
in order to take full advantage of the economies of the mail question- 
naire. However, once a sample is designated for field follow-up and the 
respondent is actually interviewed in the field, the mail questionnaires 
returned (other than those designated for field follow-up) must be dis- 
carded. 

(b) The optimum procedure described earlier for simple random
sampling can be extended to stratified random sampling. Suppose, for
example, that the population is divided into L strata with $N_i$ establish-
ments in each. If the costs per establishment do not differ widely be-
tween the different strata, a simple procedure is available for achiev-
ing economies through the joint use of the mail questionnaire method
and field follow-up. The first step is to determine the size of sample
required under the assumption of a 100 per cent response when alloca-
tion of the sample to the various strata is made in accordance with the
well-known formula³

$$\tilde{n}_i = \frac{N_i\sigma_i}{\sum_i N_i\sigma_i}\,\tilde{n}, \tag{8}$$

where $\tilde{n}$ is the size of sample required to achieve the required accuracy,
$\varepsilon$. For stratified sampling $\tilde{n}$ is approximately equal to $(\sum_i N_i\sigma_i)^2/\varepsilon^2$.

The procedure is then to merely use formulas (3) or (6), for each
stratum, to obtain the values for $n_i$, the number of mail questionnaires
to be sent out in the i-th stratum, and $r_i$, the number of the non-
respondents to the mail questionnaire to be taken for personal inter-
view.
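
The allocation in formula (8) is proportional to the product of stratum size and stratum standard deviation. A minimal sketch follows; the three strata and their standard deviations are hypothetical figures chosen for the example, and the function name is an assumption.

```python
def neyman_allocation(total_n, stratum_sizes, stratum_sds):
    """Allocate total_n to strata in proportion to N_i * sigma_i, as in formula (8)."""
    weights = [size * sd for size, sd in zip(stratum_sizes, stratum_sds)]
    total_weight = sum(weights)
    return [total_n * w / total_weight for w in weights]


# Hypothetical strata: many small stores, fewer medium stores, a few large stores.
sizes = [30_000, 9_000, 1_000]
sds = [2.0, 10.0, 50.0]
allocation = neyman_allocation(1_000, sizes, sds)
print(allocation)
```

Each stratum's share would then be treated as the complete-response sample size for that stratum when formulas (3) or (6) are applied stratum by stratum.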

³ J. Neyman, “On the two different aspects of the representative method; a method of stratified
sampling and the method of purposive selection,” Journal of the Royal Statistical Society, New Series
Vol. 97 (1934), pp. 558-606.

This procedure is not the optimum except under special conditions,
but will be effective except in situations where the costs differ very
widely in the various strata. While the optimum value for $r_i$ is the same
as given in (6) when the subscript i is attached to each of the terms,
the optimum value for $n_i$ is more complicated than the corresponding
value for the unstratified case. To determine the optimum values, in
general, for stratified sampling, we first determine the optimum num-
ber of questionnaires to be sent out (n) and this is found to be equal to


N22 WN; 


n= af De 


®; N; — 1 


N; S; ‘\ >» ®; 
S.—-1 Obi- F — 








+2 


k; -—1 
3, = 


where 


kSjonin N; 


VCsQ(Si — 1) 


and 7 is given by 


iN; : 
Niq/ ———~«: 
(x V/ Ni—1 «) 
N; 


( + > N; — 22) 


N;-1 





The optimum allocation of the n questionnaires to strata is then equal 
to 
®; 


Ba §. 
» 


(c) A ratio (or regression) type of estimate can be used instead of
estimate (1) and at the same time make use of the optimum procedure.
Thus if employment figures were available from a past census, an esti-
mate of total employment which may be more efficient than estimate
(1) consists of applying an estimate of the change in employment
since the census date to the known number of employees at the census
date. If we let

$\bar{y}_1'$ = Average number of employees at the census date, per establish-
ment responding by mail;

$\bar{y}_2''$ = Average number of employees at the census date, per es-
tablishment in the field interview sample;                              (10)

then an estimate of total employment would be

$$x' = Y\,\frac{m\,\bar{x}_1' + s\,\bar{x}_2''}{m\,\bar{y}_1' + s\,\bar{y}_2''}, \tag{11}$$

where
Y = Total employment at the census date, and $m$, $s$, $\bar{x}_1'$, and $\bar{x}_2''$
are defined as in estimate (1). An approximation to the variance of
this estimate can be put into the same form as the variance $\sigma_{x'}^2$, ex-
cept that the population variances that appear in the expression will
now be the approximations to the variances of ratios. Then,

$$\sigma_{x'}^2 = N^2\,\frac{N-n}{(N-1)\,n}\,\sigma_1^2 + \frac{N}{n}\,\frac{S^2}{S-1}\,(k-1)\,\sigma_2^2, \tag{12}$$

where

$$\sigma_1^2 = \sigma_x^2 + R^2\sigma_y^2 - 2R\,\sigma_{xy}$$

and $\sigma_x^2$ is the same as $\sigma^2$ in formula (2); $\sigma_y^2$ is the corresponding variance
of the number of employees at the census date; $\sigma_{xy}$ is the covariance
between x and y; and $R = \bar{x}/\bar{y}$ is the change in employment from the
census date. Similarly,

$$\sigma_2^2 = \sigma_{x_2}^2 + R^2\sigma_{y_2}^2 - 2R\,\sigma_{x_2 y_2},$$

where $\sigma_{x_2}^2$, $\sigma_{y_2}^2$, and $\sigma_{x_2 y_2}$ have the same meaning as the corresponding
terms for $\sigma_1^2$ except that these refer to the non-respondent population.

Hence the optimum number of schedules to be mailed and field
interviews to be taken can be determined as before merely by sub-
stituting $\sigma_1^2$ and $\sigma_2^2$ in formula (12) for $\sigma^2$ and $\sigma_2^2$ respectively in formula
(2). The optimum formulas will then be given by (6) and (7).
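
The ratio estimate can be sketched in a few lines, reading estimate (11) as the census total Y multiplied by the ratio of the weighted current-date sample total to the weighted census-date sample total. All numerical inputs below are hypothetical, as is the function name.

```python
def ratio_estimate_total(Y, m, s, mean_x_mail, mean_x_field, mean_y_mail, mean_y_field):
    """Census total Y times the estimated change in employment since the census."""
    current = m * mean_x_mail + s * mean_x_field   # weighted current-date sample total
    census = m * mean_y_mail + s * mean_y_field    # weighted census-date sample total
    return Y * current / census


# Hypothetical inputs: 500 mail respondents and 500 non-respondents, with averages
# of 6.0 and 3.0 employees now against 5.0 and 3.0 at the census date.
x_ratio = ratio_estimate_total(Y=200_000, m=500, s=500,
                               mean_x_mail=6.0, mean_x_field=3.0,
                               mean_y_mail=5.0, mean_y_field=3.0)
print(x_ratio)
```

Here the sample indicates a 12.5 per cent rise in employment, which is applied to the known census total.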


APPENDIX 


We shall now derive the variance of $x'$ given by formula (2) and de-
termine the optimum value of n and k, given by formulas (6) and (7).

$$\sigma_{x'}^2 = E(x' - x)^2 = E\left\{\frac{N}{n}\left(m\,\bar{x}_1' + s\,\bar{x}_2''\right) - x\right\}^2
= N^2\,E\left\{\frac{m\,\bar{x}_1' + s\,\bar{x}_2''}{n} - \bar{x}\right\}^2.$$

But $\bar{x}' = (m\,\bar{x}_1' + s\,\bar{x}_2')/n$, where $\bar{x}_2'$ = average for the s non-respondents.
Therefore $m\,\bar{x}_1' = n\,\bar{x}' - s\,\bar{x}_2'$, and

$$\sigma_{x'}^2 = \frac{N^2}{n^2}\,E\left\{n(\bar{x}' - \bar{x}) + s(\bar{x}_2'' - \bar{x}_2')\right\}^2$$

$$= \frac{N^2}{n^2}\left\{E\,n^2(\bar{x}' - \bar{x})^2 + 2E\,n(\bar{x}' - \bar{x})\,s(\bar{x}_2'' - \bar{x}_2') + E\,s^2(\bar{x}_2'' - \bar{x}_2')^2\right\}. \tag{13}$$

Now for a fixed set of n observations, the quantities $\bar{x}'$, $s$, $\bar{x}$, and
$\bar{x}_2'$ are also fixed, and $E\,\bar{x}_2'' = \bar{x}_2'$.

Therefore, the middle term of (13) vanishes. The third term is equal
to

$$\frac{N^2}{n^2}\,E\,s^2\,\frac{s-r}{(s-1)\,r}\,{\sigma_2'}^2,$$

where ${\sigma_2'}^2$ denotes the variance among the s non-respondents in the sample.

Now the average or expected value over all subsets of exactly s non-
responses is equal to

$$\frac{N^2}{n^2}\,s^2\,\frac{s-r}{(s-1)\,r}\,\frac{S}{S-1}\,\frac{s-1}{s}\,\sigma_2^2
= \frac{N^2}{n^2}\,\frac{S}{S-1}\,s\,(k-1)\,\sigma_2^2, \tag{14}$$

where $\sigma_2^2$ is the variance between elements in the population
of non-respondents, S is the number of non-respondents in the popula-
tion and

$$k = \frac{s}{r}.$$

Since s varies from sample to sample, we must now take the expected
value of (14) for all possible samples. This turns out to be

$$\frac{N^2}{n^2}\,\frac{S}{S-1}\,(k-1)\,\frac{nS}{N}\,\sigma_2^2 = \frac{N}{n}\,\frac{S^2}{S-1}\,(k-1)\,\sigma_2^2,$$

and since

$$E(\bar{x}' - \bar{x})^2 = \frac{N-n}{(N-1)\,n}\,\frac{\sum_{i=1}^{N}(x_i - \bar{x})^2}{N} = \frac{N-n}{(N-1)\,n}\,\sigma^2,$$

$$\sigma_{x'}^2 = N^2\,\frac{N-n}{(N-1)\,n}\,\sigma^2 + \frac{N}{n}\,\frac{S^2}{S-1}\,(k-1)\,\sigma_2^2.$$

Let $\varepsilon^2 = N^2\,\frac{N-\tilde{n}}{(N-1)\,\tilde{n}}\,\sigma^2$. We wish to find optimum values of n and k
such that the cost of the survey will be a minimum, for a fixed error,
$\varepsilon$, in our estimate of x. The cost of the survey is expressible as

$$C = C_1 n + C_2 P n + C_3\,\frac{Qn}{k},$$

where n is the number of questionnaires sent out and $C_1$ is the cost of
mailing per questionnaire; $Pn$ is the expected number of responses in
the sample and $C_2$ is the cost per mail response; $Qn/k$ is the expected
number of field interviews and $C_3$ is the cost per field interview.

We then find the values of n and k which minimize the cost subject
to the condition that $\sigma_{x'}^2 = \varepsilon^2$. These values turn out to be

$$k = \sqrt{\left(\frac{N^2(S-1)\,\sigma^2}{S^2(N-1)\,\sigma_2^2} - 1\right)\frac{C_3\,Q}{C_1 + C_2\,P}}$$

and

$$n = \tilde{n}\left\{1 + (k-1)\,Q^2\,\frac{N-1}{S-1}\,\frac{\sigma_2^2}{\sigma^2}\right\}.$$
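
As a numerical check that the general value of k collapses to the simplified one, the sketch below reads the general formula as k = sqrt{(N²(S-1)σ² / (S²(N-1)σ₂²) - 1) · C₃Q / (C₁ + C₂P)} and the simplified formula (4) as k = sqrt{C₃P / (C₁ + C₂P)}. That reading of the formulas, the function names, and the choice S = QN are assumptions made here for illustration.

```python
import math


def general_k(N, S, var, var_nonresp, P, c_mail=0.10, c_proc=0.40, c_field=4.50):
    """General optimum k, without the equal-variance and large-population assumptions."""
    Q = 1.0 - P
    ratio = (N ** 2 * (S - 1) * var) / (S ** 2 * (N - 1) * var_nonresp)
    return math.sqrt((ratio - 1.0) * c_field * Q / (c_mail + c_proc * P))


def simple_k(P, c_mail=0.10, c_proc=0.40, c_field=4.50):
    """Simplified optimum k of formula (4)."""
    return math.sqrt(c_field * P / (c_mail + c_proc * P))


# 40,000 establishments, half of them non-respondents, equal variances.
k_general = general_k(N=40_000, S=20_000, var=1.0, var_nonresp=1.0, P=0.5)
k_simple = simple_k(0.5)
print(k_general, k_simple)
```

With a population of this size and equal variances the two values agree to about four decimal places, which is why the simplified formulas suffice for the illustration in the text.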


CORRECTION 


In the article, “Problems and Methods of the Sample Survey of 
Business” by Morris H. Hansen, William N. Hurwitz, and Margaret 
Gurney which appeared in the June, 1946 issue of this Journal, Formula 
5, page 186, should read 
























ON THE ECONOMIC THEORY OF COST OF LIVING 
INDEX NUMBERS 


MELVILLE J. ULMER 
U. S. Department of Commerce 

The concept of a “true” cost of living index has been care- 
fully defined in theoretical economic-statistical literature, 
but the relation between this concept and officially published 
cost of living indices (ordinarily computed by Laspeyre’s 
formula) has never been established. Theoreticians have shown 
only that $L > I_0$ and that $I_1 > P$, where: $L$ is Laspeyre’s index;
$I_0$ is the true index based on the real income level of the base
period; $I_1$ is the true index based on the real income level of
the given period; $P$ is Paasche’s index.

Further analysis shows that definite theoretical relation-
ships may be established among the differences, $L - I_0$, $I_1 - P$,
$I_0 - I_1$, and $L - P$. These relationships make possible an estimate
of the maximum difference likely to exist under ordinary
circumstances between computed index numbers and theo-
retical indices, as follows:

$$L > I_0 > \frac{98.5}{100}\,L, \quad\text{and}\quad P < I_1 < \frac{101.5}{100}\,P.$$


IDEALLY, a cost of living index ought to measure the change in money
income required to yield equivalent satisfaction in two or more
situations. This is the measure obviously required most frequently for
practical purposes. It is also the definition generally adopted by
theoreticians.¹ It is not, however, the definition of those who actually
construct index numbers.

Unavoidably conscious of practical limitations, the makers of cost of
living index numbers characteristically define their products purely in
terms of the formula (usually Laspeyre’s) used for their construction.
The relation between these index numbers actually available for use,
and the ideal, theoretically defined measure noted above has neverthe-
less remained persistently obscure. The users of these indices are in-
cessantly tempted, wittingly or not, to infer a much broader applica-
tion than the bare formula alone will allow. On the other hand, theo-
reticians warn that Laspeyre’s index is in no sense a reliable indicator
of the true (theoretically defined) change in the cost of living.²
The present paper is an outgrowth of this dilemma. Its object is to
demonstrate that under certain widely applicable conditions: (1) a
definite quantitative relationship may be established between Las-
peyre’s and the true index; (2) that from available evidence it may be
concluded that Laspeyre’s index is a close approximation of the true
index; (3) that the maximum error of this approximation, likely to be
encountered, may be determined. A knowledge of this relationship, and
of the necessary conditions involved, should be of value not only to
the users of cost of living index numbers but also to their makers.

¹ For reference to the definitions employed by various writers in this field see Frisch, Ragnar,
“Annual Survey of General Economic Theory: The Problem of Index Numbers,” Econometrica, Jan.
1936, pp. 11-12.
² Cf. especially Staehle, H., “A Development of the Economic Theory of Price Index Numbers,”
Review of Economic Studies, June 1935, pp. 163-188; Frisch, Ragnar, op. cit., pp. 1-38; Haberler,
Gottfried, Der Sinn der Index-Zahlen, Tübingen, 1927; Mudgett, B. D., “The Cost of Living Index and
Konüs’ Condition,” Econometrica, April 1945, pp. 171-181.


THE RELATIONSHIP BETWEEN THE TRUE COST OF LIVING INDEX 
AND LASPEYRE’S FORMULA 
Stated another way, the problem of measuring the true change in the
cost of living consists: (1) of identifying equal real incomes in (say) two
different situations; (2) of determining the ratio of the money values of
these two real incomes. The result, then, would indeed be the theo-
retically “true” cost of living index as defined above. Strictly inter-
preted, a separate index of this kind is required for each distinguishable
real income level, though these need not necessarily differ numerically
in all cases.

[FIGURE 2]

It has been shown, under certain highly restrictive assumptions, that 
Laspeyre’s index and Paasche’s index yield the upper and lower limits 
respectively of the true cost of living index. Although this demon- 
stration is of no practical moment in itself—because of the narrowly 
limiting assumptions—it is a useful step in the derivation of more sig- 
nificant relations. It is described most conveniently in terms of indif- 
ference curves and expenditure lines. 

In Figure 1 let the indifference curve shown represent a part of the
indifference map of wage-earners for two commodities, A and B. By
definition every point on this curve represents a bundle of goods of
equivalent satisfaction. Let $Q_0$ and $Q_1$ represent two situations on this
indifference curve. Let ab and cd represent expenditure lines tangent to
the curve at the points $Q_0$ and $Q_1$ respectively. These lines are of the form
$P_a A + P_b B = R$, where A and B are the two commodities, $P_a$ and $P_b$ are their
respective prices and R equals total expenditures. The point of tangency
of an expenditure line to an indifference curve indicates the goods
which are purchased by consumers when they maximize their satisfaction
under the given price system and the given total expenditures
indicated by the line.

If total money expenditures in situation $Q_1$, in the accepted notation,
equal $\sum p_1 q_1$ and in $Q_0$ equal $\sum p_0 q_0$, the ratio $\sum p_1 q_1 / \sum p_0 q_0$ is the true cost
of living index. Stated more fully, $\sum p_1 q_1 / \sum p_0 q_0$ is the ratio of the total
money expenditures required to yield equivalent satisfaction in the
two situations, which is the definition of the true cost of living index.

In Figure 1 let commodity A be the numéraire, so that the intercepts
of the expenditure lines on the A-axis will equal R, the total expenditures
in each case. Then if we indicate the true cost of living index by
the symbol I, we can write:

$$I = \frac{\sum p_1 q_1}{\sum p_0 q_0} = \frac{oc}{oa}.$$


Now in the same figure the expenditure line ef is drawn parallel to
ab, and through the point $Q_1$. Since the slope of an expenditure line
is given by the ratio of the prices $P_a$ and $P_b$, the equality of slopes shows that ab and
ef are characterized by the same price system. Moreover, since ef
passes through $Q_1$, its intercept, oe, indicates the total expenditures required
to buy the commodities actually purchased in $Q_1$ under the price
system prevailing in $Q_0$: that is, $oe = \sum p_0 q_1$. In the same way, if gh is
drawn parallel to cd, we can write $og = \sum p_1 q_0$, representing total expenditures
required to buy the commodities actually purchased in $Q_0$,
under the price conditions of $Q_1$.

534 AMERICAN STATISTICAL ASSOCIATION

From the figure, it will be noted that the following relationships
hold,³ all being necessary results of the fact that the normal form of an
indifference curve is convex to the origin:

$$\frac{og}{oa} > \frac{oc}{oa}, \quad \text{since } og > oc;$$

$$\frac{oc}{oe} < \frac{oc}{oa}, \quad \text{since } oe > oa.$$


If we substitute for these intercepts the expenditures they represent, 
we obtain: 


$$\frac{\sum p_1 q_0}{\sum p_0 q_0} > I > \frac{\sum p_1 q_1}{\sum p_0 q_1}.$$


In other words, Laspeyre’s index is the upper limit and Paasche’s 
index is the lower limit of the true cost of living index. This relationship, 
however, has been demonstrated only for a very special case. The two 
most important underlying assumptions are that tastes as well as real 
incomes⁴ remain the same in the two situations compared.⁵ Of course,
the assumption that only two commodities appear in consumers’ 
budgets is made here as well as elsewhere in this article for convenience 
only; it has been shown that the same relationship would hold for any 
number of commodities. 
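
The special case above can be checked numerically. The following is only an illustrative sketch: it assumes a Cobb-Douglas form for the wage-earners' indifference map (an assumption not made in the text), and the function names and price figures are hypothetical. Both situations are placed on the same indifference curve, so that a single true index exists and should fall between the two formulas.

```python
# Illustrative check of the limits under an ASSUMED Cobb-Douglas indifference
# map U = A**a * B**(1 - a); the text itself assumes no particular form.

def expenditure(p_a, p_b, u, a=0.5):
    """Least total expenditure R that reaches satisfaction u at the given prices."""
    return u * (p_a / a) ** a * (p_b / (1 - a)) ** (1 - a)

def bundle(p_a, p_b, u, a=0.5):
    """Quantities of A and B bought at the point of tangency."""
    r = expenditure(p_a, p_b, u, a)
    return a * r / p_a, (1 - a) * r / p_b

def cost(prices, quantities):
    """Sum of price times quantity over the two commodities."""
    return sum(p * q for p, q in zip(prices, quantities))

u = 10.0            # the common level of satisfaction (hypothetical)
p0 = (1.0, 1.0)     # prices of A and B in situation Q0 (hypothetical)
p1 = (1.0, 2.0)     # prices of A and B in situation Q1 (hypothetical)
q0 = bundle(*p0, u)
q1 = bundle(*p1, u)

true_index = cost(p1, q1) / cost(p0, q0)   # I
laspeyres = cost(p1, q0) / cost(p0, q0)    # upper limit
paasche = cost(p1, q1) / cost(p0, q1)      # lower limit
print(laspeyres, true_index, paasche)
```

With these figures the Laspeyre formula gives 1.5 and the Paasche formula about 1.33, with the true index between them; other price changes may be tried by altering `p1`.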

A more useful result is obtained if the most important of these as- 
sumptions is relaxed and we permit the situations to be compared to lie 
on different indifference curves. It is obvious, however, from the defini- 
tion of a cost of living index, that if the situations to be compared are 
characterized by different real incomes, there must be two true indices 
(in the case of two situations), one based on the lower real income level 
and one based on the other. This is evident in Figure 2, where $Q_0$ and
$Q_1$ are the two situations located on different indifference curves, while








3 The following brief proof of the relationship $\sum p_1 q_0/\sum p_0 q_0 > I > \sum p_1 q_1/\sum p_0 q_1$ is the same as that
given in H. Staehle, op. cit., pp. 168-169.

4 The sterility of this assumption for practical purposes was first pointed out by H. Staehle, op. cit.,
p. 170. If it were actually known that real incomes were the same in the two situations compared, it
would be unnecessary to compute Laspeyre’s and Paasche’s indices; the ratio of the actual money
expenditures in the two situations, $\sum p_1 q_1/\sum p_0 q_0$, would give the true cost of living index exactly.

5 It is curious to note that these assumptions are occasionally overlooked in practical work and the
relationship $\sum p_1 q_0/\sum p_0 q_0 > I > \sum p_1 q_1/\sum p_0 q_1$ is explicitly taken as generally applicable. See, for example,
Henry Shavell, “Price Deflators for Consumer Commodities and Capital Equipment, 1929-1942,”
Survey of Current Business, May 1943, p. 14.

ab and cd are the respective expenditure lines tangent to the indifference
curves at these points. If gh is drawn parallel to cd and tangent to the
indifference curve at point $Q_0'$, then point $Q_0'$ would represent the
amounts of A and B which would be consumed at the lower utility level
at the prices prevailing in $Q_1$. Hence, the ratio of the total expenditures
in $Q_0'$ to those in $Q_0$ would indicate one true cost of living index; that
is, the change in money income required to provide, under the price
conditions of $Q_1$, a real income equivalent in satisfaction to that received
in $Q_0$. If commodity A is again taken as numéraire, then this
true index may be represented by:

$$I_0 = \frac{\sum p_1 q_0'}{\sum p_0 q_0} = \frac{og}{oa}.$$


In the same way, by drawing the tangent ef parallel to ab, it can be
shown that a second true cost of living index, based on the real income
level which prevailed in $Q_1$, may be represented by:

$$I_1 = \frac{\sum p_1 q_1}{\sum p_0 q_1'} = \frac{oc}{oe}.$$


These two true cost of living indices are, of course, quite distinct,
and it has been shown that they do not necessarily lie within the limits
of Laspeyre’s and Paasche’s indices. Nevertheless, another relationship
between these two formulas and the two true indices may be derived,
the practical importance of which has been hitherto overlooked. If the
expenditure line ij is drawn parallel to cd and through the point $Q_0$,
then Laspeyre’s index is represented by:

$$L = \frac{\sum p_1 q_0}{\sum p_0 q_0} = \frac{oi}{oa}.$$

Similarly, by drawing kl parallel to ab through point $Q_1$, we obtain
Paasche’s index:

$$P = \frac{\sum p_1 q_1}{\sum p_0 q_1} = \frac{oc}{ok}.$$


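It is easily seen from the diagram that:

$$\frac{oi}{oa} > \frac{og}{oa} \quad \text{and} \quad \frac{oc}{oe} > \frac{oc}{ok}.$$

That is:

$$\frac{\sum p_1 q_0}{\sum p_0 q_0} > \frac{\sum p_1 q_0'}{\sum p_0 q_0} \quad \text{and} \quad \frac{\sum p_1 q_1}{\sum p_0 q_1'} > \frac{\sum p_1 q_1}{\sum p_0 q_1}, \quad \text{or} \tag{1}$$

$$L > I_0 \quad \text{and} \quad I_1 > P. \tag{2}$$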


It is of course true, as other writers have contended, that these inequalities,
considered alone, are of only very limited usefulness. They
tell us only that $I_0$ is at some point below L, and $I_1$ is at some point
above P; the relationships between L and P and between $I_0$ and $I_1$ are
entirely unrestricted. Further analysis of these inequalities, however,
permits much more useful conclusions, including a tentative estimate
of the actual difference, to be expected in practice, between L and $I_0$,
and between P and $I_1$, under certain conditions.

It is first of all important to note that $I_0$ and $I_1$ may differ for one
reason only. They may differ only because real incomes have changed
between the two situations compared; actual differences will occur only
insofar as the change in real income results in altering the pattern of
consumption between the two situations, relatively increasing (or decreasing)
consumption of items which have advanced most in price,
and relatively decreasing (or increasing) consumption of items which
have advanced least. If the change in real income itself gave rise to no
change in the pattern of consumption (i.e., if the elasticity of demand
with respect to income were the same for all goods and services and
equal to unity), $I_0$ would equal $I_1$. If there were no change in real
income, of course, $I_0$ would equal $I_1$ in any case. Let the letter k represent
the difference between $I_0$ and $I_1$.

On the other hand, differences between L and P may occur for two
distinct reasons. The first is the same as that which accounts for any
difference occurring between $I_0$ and $I_1$. If the difference k occurred between
$I_0$ and $I_1$, then L and P would also differ by k, except insofar as
this difference might be diminished or enhanced by the play of a second
factor. Thus, if the elasticity of demand with respect to prices were zero
for all commodities, then in the inequalities of (1) above, $q_0'$ would
equal $q_0$ and $q_1'$ would equal $q_1$; hence L would equal $I_0$ and P would
equal $I_1$. Differences between L and P and between $I_0$ and $I_1$ would
therefore be the same, and in each case would be due to the possible
effects of a change in real income upon the pattern of consumption. We
have agreed to call this difference k.

The second factor making for differences between L and P is the possible
alteration in patterns of consumption attributable to changes in
relative prices. It is that factor alone which accounts for the fact that
L must be greater than $I_0$; the numerator of L is too large because it
assumes that consumers do not alter their consumption in response to
relative price changes, buying relatively more of cheaper items, and
relatively less of more expensive items. For an analogous reason the
denominator of P is too large, and P is therefore smaller than $I_1$. As
already indicated, if the elasticity of demand with respect to prices
were actually zero for all commodities, the difference between L and P
would be k, the same as that between $I_0$ and $I_1$. Let us represent by the
letter e the difference which occurs between L and P as the result of
alterations in the pattern of consumption attributable to changes in
relative prices. Then the total difference between L and P may be represented
by:

$$L - P = [(L - I_0) + (I_1 - P)] + (I_0 - I_1) = e + k = d. \tag{3}$$


The quantity e, of course, is necessarily positive. Provided tastes
remain unchanged (in accord with the assumption made earlier), its
effect can be only to make L greater than P. The quantity k, however,
may be positive or negative, and may be numerically greater or less
than e. Hence d may be positive or negative.

Combining relationship (3) with the inequalities in (2), however, it
is easily demonstrated that limits may be determined for the two true
indices which should be of some practical importance. Since $I_0 - I_1 = k$
and $L - P = d$, we obtain from (2) by substitution:

$$L > I_0 \quad \text{and} \quad I_0 - k > L - d,$$

$$L > I_0 > L - d + k,$$

$$L > I_0 > L - e. \tag{4}$$

Similarly we obtain for $I_1$ the limits:

$$P < I_1 < P + e. \tag{5}$$


APPLICATION OF THE RELATIONSHIP 


What does the relationship, $L > I_0 > L - e$, mean in quantitative
terms? A precise answer to this question can be obtained only through
direct measurement of e, a task clearly fraught with numerous statistical
obstacles. For the interpretation of changes in the cost of living
over time, as approximated by Laspeyre’s index, however, an indirect
approach is available which should prove satisfactory for most purposes.

The indirect approach is based on the fact that when k is positive,
then d is positive and greater than e, by definition. Over a period of
time in which the number of years characterized by a higher income
level than that which prevailed in the base period was roughly equal to
the number of years characterized by a lower income level, it is probable
that k would have a roughly equal chance of being positive or negative.⁶
In any event, if the period of time were fairly long it is probable that k
would be positive in a substantial number of cases. If the number of
such cases were sufficiently large, then the highest value of d obtained
during the period might be taken as an estimate of the highest value of
d likely to be obtained in the future, an assumption which can be
continuously tested with the passage of time. However, the highest
value of d to be obtained can also serve as a conservative estimate of
the maximum value likely to be obtained for e, since $d > e$ when $k > 0$.

Since d is easily measured, this conclusion permits an estimate of the
maximum error involved if L is taken as an approximation of $I_0$, or if
P is taken as an approximation of $I_1$. Although measurements ideally
suited to this purpose are not now available, the evidence which does
exist, presented below, suggests that the value of d at its greatest is
very small, probably in relative terms not more than 1½ per cent.
Hence, substituting this value in (4) and (5) above, we obtain the following
limits to the two true indices:

$$L > I_0 > L - \frac{1.5}{100} L, \quad \text{or}$$

$$L > I_0 > \frac{98.5}{100} L, \quad \text{and} \tag{6}$$

$$P < I_1 < \frac{101.5}{100} P. \tag{7}$$


The existing evidence bearing on the value of d is shown in Tables 1 
and 2. Although Table 1 does not actually present values of d, it has 


6 Whether k is positive or negative in any given case depends upon: (1) the direction of the change
in real income from the base period, and (2) the elasticity of demand with respect to income for commodities
whose prices are relatively flexible as compared with that for commodities whose prices are
relatively inflexible. Thus, in the case of two commodities:

$$I_0 = \frac{\sum p_1 q_0'}{\sum p_0 q_0}, \qquad I_1 = \frac{\sum p_1 q_1}{\sum p_0 q_1'}.$$

But, distinguishing the two commodities by the superscripts ′ and ″, $q' = f(p', p'', U)$ and $q'' = F(p'', p', U)$, where U is an index of real income level. Hence, substituting:

$$I_0 = \frac{p_1' f(p_1', p_1'', U_0) + p_1'' F(p_1'', p_1', U_0)}{p_0' f(p_0', p_0'', U_0) + p_0'' F(p_0'', p_0', U_0)}, \qquad I_1 = \frac{p_1' f(p_1', p_1'', U_1) + p_1'' F(p_1'', p_1', U_1)}{p_0' f(p_0', p_0'', U_1) + p_0'' F(p_0'', p_0', U_1)}.$$





























































supplementary interest. In this table the two measures shown are both
fixed base (Laspeyre’s) indices. One of these measures of changes in the
cost of living, however, is based on average purchases of wage-earners
in 1917-1918; the other is based upon purchases in 1934-1936. Most
notable is the fact that the percentage difference between these indices
during the period covered is at most 1.1 per cent. The clear implication
is that even over long periods of time (in this case 17 years)
stability in the pattern of consumption may be very great.


TABLE 1

PER CENT DIFFERENCE BETWEEN OLD AND CURRENT COST OF
LIVING INDICES OF THE BUREAU OF LABOR STATISTICS

Indices, 1935-39 = 100

                Old Index:            Current Index:        Per cent
                fixed budget priced,  fixed budget priced,  difference,*
Date            1917-1918             1934-1936             old to
                average purchases     average purchases     current

1935  Mar.       97.8                  97.8                  0
      July       97.6                  97.6                  0
      Oct.       98.0                  98.0                  0
1936  Jan.       98.7                  98.8                  .1
      Apr.       97.9                  97.8                  .1
      July       99.6                  99.4                  .2
      Sept.     100.0                 100.4                  .4
      Dec.      100.0                  99.8                  .2
1937  Mar.      101.7                 101.8                  .1
      June      102.6                 102.8                  .2
      Sept.     103.2                 104.3                 1.1
      Dec.      102.6                 103.0                  .4
1938  Mar.      100.7                 100.9                  .2
      June      101.2                 100.9                  .3
      Sept.     100.4                 100.7                  .3
      Dec.      100.4                 100.2                  .2
1939  Mar.       99.6                  99.1                  .5
      June       99.2                  98.6                  .6
      Sept.     100.4                 100.6                  .2
      Dec.       99.8                  99.6                  .2

* Signs ignored.
Source: U. S. Bureau of Labor Statistics.

Table 2 presents values of d from 1929 to 1940, but not of the kind
ideally required for the purpose set forth in this paper. Ideally, d
should be computed from indices measuring changes in the cost of living
for some fairly homogeneous group sharing a roughly common
standard of living; the cost of living index in which chief interest lies,
of course, is that for urban wage-earners. The retail price indices of
Table 2, however, measure price changes only for sales in retail stores,
and of course cover total sales of this kind in the United States.
Nevertheless, in the absence of better data, the results are of some
interest. They show notably low values of d. Clearly, the size of d is
subject to cyclical swings, reaching maximum values regularly at the
peaks and troughs of business cycles.⁷ Indeed, taken in conjunction
with Table 1, there is the suggestion that cyclical swings in real incomes
and relative prices may have greater effects upon the pattern
of consumption than the secular changes over a period as long as 17 or
18 years. Nevertheless, it is most important to note that the maximum
value of d attained during the entire period is only 1.4 per cent. It is
on the basis of this maximum value (rounding the figure to 1½ per
cent⁸) that the limits to the true indices in (6) and (7) were established.


TABLE 2

VALUES OF d, BASED ON RETAIL PRICE INDICES
OF DEPARTMENT OF COMMERCE

Indices, 1939 = 100

          L                      P                        d
Date*     Fixed weighted index;  Variable weighted index; expressed as
          weights based on       weights based on         per cent of L†
          sales in 1939          sales in given year

1929      131.8                  129.9                    1.4
1930      124.3                  123.6                     .6
1931      107.0                  106.9                     .1
1932       92.7                   92.4                     .3
1933       90.4                   90.2                     .2
1934       98.8                   98.7                     .1
1935      102.3                  102.0                     .3
1936      102.4                  102.1                     .3
1937      106.3                  105.9                     .4
1938      101.7                  101.7                     .0
1939      100.0                  100.0                     -
1940      101.2                  101.1                     .1

* Index numbers for war years subsequent to 1940 are not suited to the purpose of this table and
therefore are omitted; see section below on “Some Economic Implications.”

† The values of d, when expressed as percentages of P, are the same for every year except 1929. In
1929 the value of d, when expressed as a per cent of L, is 1.44; when expressed as a per cent of P, it
is 1.46.

Source: U. S. Department of Commerce.
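
The values of d in Table 2 can be recomputed directly from the two index columns. The short sketch below does so; the figures are those of the table, while the variable and function names are merely illustrative.

```python
# (year, L, P) as given in Table 2; 1939 is the base year of both indices.
table2 = [
    (1929, 131.8, 129.9), (1930, 124.3, 123.6), (1931, 107.0, 106.9),
    (1932, 92.7, 92.4), (1933, 90.4, 90.2), (1934, 98.8, 98.7),
    (1935, 102.3, 102.0), (1936, 102.4, 102.1), (1937, 106.3, 105.9),
    (1938, 101.7, 101.7), (1939, 100.0, 100.0), (1940, 101.2, 101.1),
]

def d_percent(laspeyres, paasche, base):
    """d = L - P, expressed as a per cent of the chosen base (L or P)."""
    return 100.0 * (laspeyres - paasche) / base

# d as a per cent of L for every year, and the year in which it is greatest.
d_of_l = {year: d_percent(l, p, l) for year, l, p in table2}
peak_year = max(d_of_l, key=d_of_l.get)
peak_of_l = d_of_l[peak_year]
peak_of_p = d_percent(131.8, 129.9, 129.9)   # the 1929 value on a P base

print(peak_year, round(peak_of_l, 2), round(peak_of_p, 2))
```

The greatest value falls in 1929 and comes to 1.44 per cent of L, or 1.46 per cent of P, in agreement with the note to the table.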

7 Cf. the remarks of Frederick C. Mills in Black, John D. and Mudgett, Bruce D., Research in 
Agricultural Index Numbers, Social Science Research Council, Bulletin No. 10, 1938, p. 53. 


8 The value of d in 1929 when stated in relative terms with Laspeyre’s index as base is, carried
to two places, 1.44 per cent; when stated relatively in terms of Paasche’s index, the value of d is 1.46
per cent.

Thus, applying these limits, if Laspeyre’s index in any given year
were 110, then the true index ($I_0$), based on the base period real income
level, would lie between 110 and 108.4. Similarly, if Paasche’s index
stood at 109, then the true index ($I_1$), based on the given year’s real
income level, would lie between 109 and 110.6.
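
The arithmetic of this example may be sketched as follows. The 1½ per cent allowance is the rounded maximum of d discussed above; the function names are hypothetical and serve only to restate relations (6) and (7).

```python
MAX_D = 1.5  # per cent; the rounded maximum value of d taken from Table 2

def limits_for_i0(laspeyres, max_d=MAX_D):
    """Relation (6): the true index I0 lies between (1 - max_d/100) * L and L."""
    return laspeyres * (1 - max_d / 100), laspeyres

def limits_for_i1(paasche, max_d=MAX_D):
    """Relation (7): the true index I1 lies between P and (1 + max_d/100) * P."""
    return paasche, paasche * (1 + max_d / 100)

low_i0, high_i0 = limits_for_i0(110)
low_i1, high_i1 = limits_for_i1(109)
print(low_i0, high_i0)
print(low_i1, high_i1)
```

For an index of 110 the lower limit is 108.35, and for an index of 109 the upper limit is 110.635; these are the figures given above as 108.4 and 110.6.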

This analysis shows, therefore, that Laspeyre’s index may indeed be
considered an approximation of the corresponding true index, and that
the maximum error of this approximation may be estimated. An
analogous statement applies to Paasche’s index. Moreover, although
the data above by no means constitute conclusive evidence,⁹ there is a
clear indication that the error involved in these approximations is most
likely very small.


SOME ECONOMIC IMPLICATIONS 


In conclusion, a word concerning the economic implications of these 
findings may be in order. The principal assumption upon which the 
above analysis was based, as previously noted, is that tastes do not 
change appreciably over the situations compared. There was also the 
implicit assumption, obviously closely related, that over the time period 
considered the bulk of goods and services remain about the same in 
variety and quality. These assumptions underlie all theoretical analyses
of cost of living index numbers, and it appears agreed that,
except during periods of violent social upheaval such as war, they
offer sufficiently close approximations of reality provided the period
considered is not too extended.¹⁰

Now it has been indicated above that, theoretically, the Laspeyre 
and Paasche index numbers would exactly equal the corresponding 
true cost of living indices if (1) there were no changes in relative prices 
from time to time, or if (2) the elasticity of demand with respect to 


9 The writer has been informed by officials of the U. S. Bureau of Labor Statistics that the Bureau
plans to conduct periodical studies of wage-earners’ budgets, probably once a year. These data, when
available, will make possible the construction of Paasche’s index periodically, and, in turn, measurements
of d suited directly to the use proposed in this article.

10 Cf., for example, Staehle, op. cit., p. 164. Stability in the pattern of consumption over time for a 
clearly defined group such as wage-earners, as well as for the nation as a whole, has been one of the 
outstanding results of budget studies conducted in this country. This and other evidence demonstrate 
beyond question that changes in tastes are usually gradual, although accelerated by cyclical influences, 
and ordinarily affect only a few commodities even over fairly long periods. Similarly, the variety and 
quality of the bulk of all goods offered for sale ordinarily remain about the same over reasonably ex- 
tended periods. The fact that minor changes do occur rather frequently can in some measure be offset 
if the maker of the index number concerned puts to use (so far as possible) what Keynes termed the 
“method of equivalent substitution.” (See A Treatise on Money, Vol. I, 1930, pp. 103-104.) Of course, 
if it is believed that fundamental changes have occurred in either the tastes of consumers or the kinds 
of goods offered for sale, or both, neither Laspeyre’s nor any other index can be said to approximate
the true index; from a practitioner’s point of view, the continuity of the index must under these circumstances
be interrupted in order to change the composition of the base period budget priced.

prices were zero for all goods and services. The first condition means
that all prices change from time to time by equal relative amounts. 
The second condition means that goods and services are consumed in 
constant proportions regardless of changes in relative prices (although 
proportions may change in response to variations in money prices and 
incomes). Obviously, neither of these conditions is tenable. 

However, it has been contended in this paper that the Laspeyre and 
Paasche index numbers are close approximations to (though they are 
not identical with) the true cost of living indices. This raises a question, 
with respect to the above conditions, of “more or less” rather than of 
“either or.” The aggregate amount of expenditures wage-earners are 
required to make in order to maintain a given plane of living is affected 
(A) by changes in money prices. It is also affected by (B) changes in 
relative prices and the adjustments in consumption patterns made by 
consumers in response to these relative price changes. The effects of factor 
(A) are measured by the true cost of living indices as well as by the 
Laspeyre and Paasche index numbers. The effects of factor (B) are 
measured only by the true indices. In economic terms, therefore, the 
findings of this paper suggest that the effects of factor (B) are ordinarily 
very small, perhaps most often negligible, when compared with the 
effects of factor (A).¹¹


11 An analysis of price behavior and of consumer expenditures from this point of view should prob- 
ably bear this out. Most analyses are designed to highlight differences among commodity price and 
consumption changes, for it is these differences which provide insight into economic behavior. Actually, 
however, the broad similarities which persist over time far outweigh the differences. While relative 
prices, in detail, change constantly, the general order of relative prices remains always about the same. 
For example, to the writer’s knowledge tenderloin steak is at all times priced considerably higher than 
beef liver, even though the exact differential may and does change. Similarly, even though consumers 
change their pattern of consumption in response to prices, the persistency of common needs, habits and
conventions is obviously far more influential than the occasional modifications.


THE COMPUTATION OF PARTIAL CORRELATION
COEFFICIENTS 


FREDERICK V. WAUGH 
Office of War Mobilization and Reconversion 


IF WE have already computed the partial regression of $x_1$ on $x_k$,
$b_{1k.23\cdots)k(\cdots n}$, and its standard error, $S_{b_{1k.23\cdots)k(\cdots n}}$, it is a simple
matter to compute the partial correlation between $x_1$ and $x_k$,
$r_{1k.23\cdots)k(\cdots n}$. To simplify the notation we shall designate the partial
regression as $b_{1k.}$, its standard error by $S_{b_{1k.}}$, and the partial correlation
as $r_{1k.}$.

A simple relation between these measures is

$$r_{1k.} = \frac{b_{1k.}}{+\sqrt{b_{1k.}^2 + N' S_{b_{1k.}}^2}}, \tag{1}$$

where $N'$ represents the number of degrees of freedom; that is, $N'$
equals the number of observations, N, minus the number of variables, n.
If we want the corrected partial correlation coefficient, $\bar{r}_{1k.}$, we simply
replace $N'$ by N in the denominator of (1).

So far as I know this formula for computing partial correlations is 
new. Certainly the computational effort is much less than that required 
by methods in common use. They are not well adapted to the computa- 
tion of partial correlation coefficients. 

We shall illustrate the use of (1) by reference to two examples chosen
from Ezekiel’s well-known text.¹ On pages 469-477 Dr. Ezekiel shows
in detail how the so-called “Doolittle Method” can be used to compute
various measures of correlation and regression in the case of a given
numerical problem of four variables. Using $x_1$ as the dependent variable,
he computes the following regressions with their respective standard
errors:

$$b_{12.34} = -0.810 \pm 0.233$$

$$b_{13.24} = 0.180 \pm 0.030$$

$$b_{14.23} = -0.390 \pm 0.159$$


His computation of the three partial correlations requires three separate,
and additional, applications of the Doolittle Method. If we use
the method suggested in this paper this work is unnecessary. Using (1),
with $N = 13$ and $N' = 13 - 4 = 9$, we get

$$r_{12.34} = \frac{-0.810}{+\sqrt{(0.810)^2 + 9(0.233)^2}} = -0.757$$

$$r_{13.24} = \frac{0.180}{+\sqrt{(0.180)^2 + 9(0.030)^2}} = 0.896$$

$$r_{14.23} = \frac{-0.390}{+\sqrt{(0.390)^2 + 9(0.159)^2}} = -0.544.$$

1 Mordecai Ezekiel, Methods of Correlation Analysis, second edition, 1941.


The above simple computations replace Ezekiel’s Tables 93 and 94 
and the computations on page 477.²
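
Formula (1) is easily mechanized. The following is a minimal sketch, with a hypothetical function name, applied to the first of the three regressions quoted above; the sign of the result follows the sign of the regression coefficient because the positive root is taken.

```python
import math

def partial_correlation(b, se, dof):
    """Formula (1): b / sqrt(b**2 + dof * se**2), taking the positive root.

    b   -- partial regression coefficient
    se  -- its standard error
    dof -- degrees of freedom, N' = N - n
    """
    return b / math.sqrt(b * b + dof * se * se)

# First regression of the example: -0.810 with standard error 0.233,
# thirteen observations and four variables.
r = partial_correlation(-0.810, 0.233, 13 - 4)
print(round(r, 3))
```

The printed value, -0.757, is the first of the partial correlations given above.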

In the other illustration we shall compute the absolute values of
several partial correlations when we know only the ratio, $S_{b_{1k.}}/b_{1k.}$.
Dividing both the numerator and denominator of (1) by $b_{1k.}$, we have

$$|r_{1k.}| = 1 \Big/ \sqrt{1 + N' S_{b_{1k.}}^2 / b_{1k.}^2}. \tag{2}$$


This equation is useful because occasionally only the ratios are given 
rather than the coefficients themselves. 

A case in point is a table in Ezekiel, summarizing the results of two
regression analyses based upon two studies of factors related to milk
production per cow.³ The first two columns of Table 1 show the standard
error ratios as given. From these we compute the last two columns,
using equation (2), with $N' = 85$ in the case of the Wisconsin study, and
with $N' = 65$ in the case of the Minnesota study.

To determine the signs of the partial correlations in this table we 
would need to know only the signs of the partial regressions, since the 
sign of any partial correlation coefficient is the same as the sign of the 
corresponding partial regression coefficient. 

The sixteen partial correlations in Table 1 were computed in slightly 
less than twenty minutes. It would have taken many hours to compute 
them by sixteen applications of the Doolittle Method. 
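
Equation (2) lends itself to the same treatment. The sketch below, again with a hypothetical function name, recomputes the two entries for total digestible nutrients from the error ratios .178 and .170 shown in Table 1.

```python
import math

def abs_partial_correlation(error_ratio, dof):
    """Equation (2): |r| = 1 / sqrt(1 + dof * (S_b / b)**2)."""
    return 1.0 / math.sqrt(1.0 + dof * error_ratio ** 2)

# Total digestible nutrients: error ratios as given in Table 1.
wisconsin = abs_partial_correlation(0.178, 85)   # N' = 85
minnesota = abs_partial_correlation(0.170, 65)   # N' = 65
print(round(wisconsin, 3), round(minnesota, 3))
```

The two results round to .520 and .589, the values entered in the table. The sign must still be taken from the corresponding regression coefficient.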

Derivation of formulas. Formulas (1) and (2) can be derived from two 
previous papers by the present writer.⁴

2 Incidentally, the partial correlations computed here do not agree with those in the first printing of 
Dr. Ezekiel's second edition. A new printing of this edition has just been made, giving the correct par- 
tials, which check with those given here. 

3 Op. cit., page 327, Table 75. We have converted the probable error ratios to standard error ratios 
by multiplying each by 1.48258. 

4 F. V. Waugh, “A Simplified Method of Determining Multiple Regression Constants,” Journal
of the American Statistical Association, Vol. 30 (1935), pp. 694-700; F. V. Waugh, “A Note
Concerning Hotelling’s Method of Inverting a Partitioned Matrix,” Annals
of Mathematical Statistics, Vol. 16 (1945), pp. 216-217.


TABLE 1

COMPUTATION OF ABSOLUTE VALUES OF PARTIAL CORRELATIONS BETWEEN
MILK PRODUCTION PER COW AND SEVERAL OTHER VARIABLES

                                  Relative error          Absolute values of
                                                          partial correlations
                               Wisconsin   Minnesota     Wisconsin   Minnesota
                               study       study         study       study

Total digestible nutrients     .178        .170          .520        .589
Nutritive ratio                .184        .141          .508        .660
Per cent of protein “good”     .420                      .250
Per cent of lime               .999                      .108
Per cent of summer feeding     .259                      .386
Per cent silage                .319        .203          .322        .521
Fat test of milk               .157        .055          .568        .914
Per cent fall freshening       .274        .17           .368        .575
Value per cow                              .397                      .310
Age of cows                                .265                      .424
Per cent grain in ratio                    .299                      .383





Briefly, the first of these papers showed that if the inverse of the complete
moment matrix is D, and its elements are $d_{ij}$, the partial correlation
coefficient

$$r_{1k.} = \frac{-d_{1k}}{+\sqrt{d_{11} d_{kk}}}; \tag{3}$$

that

$$d_{11} = 1/(m_{11} - b_{12.} m_{12} - b_{13.} m_{13} - \cdots - b_{1n.} m_{1n}); \tag{4}$$

and that

$$d_{1k} = -d_{11} b_{1k.}. \tag{5}$$

Thus, after the regression coefficients have been computed, (4) and
(5) give us two of the three values we need in (3). The missing value is
that of $d_{kk}$. This value, as well as those of $d_{11}$ and $d_{1k}$, is computed by the
procedure outlined in my 1935 paper. But the methods explained by
Fisher and Ezekiel are based upon computing the inverse of the incomplete
moment matrix, obtained by eliminating the row and column
representing the dependent variable. Fisher and Ezekiel, thus, compute
the inverse matrix, C, the elements of which are $c_{ij}$.

But, when we have computed $c_{ij}$, $d_{11}$, and $b_{1k.}$ we can use the results
of my 1945 paper to show that

$$d_{ij} = c_{ij} + d_{11} b_{1i.} b_{1j.}. \tag{6}$$


5 Both my paper and Harold Hotelling’s, “Some New Methods of Matrix Calculation,” Annals of
Mathematical Statistics, Vol. 14 (1943), pp. 1-34, were presented as means of computing the inverse of
a square matrix of 2p rows that was partitioned into four p-rowed matrices. But the same results can
be obtained if any square matrix is so partitioned that the matrix labeled, a, in my paper is square.

It is possible to use (6) to compute the complete inverse, D. Formulas
(1) and (2) require only the computation of the diagonal elements,
$d_{kk}$. These can be written

$$d_{kk} = c_{kk} + d_{11} b_{1k.}^2. \tag{7}$$

Inserting in (3) the values in (5) and (6), we have

$$r_{1k.} = \frac{d_{11} b_{1k.}}{+\sqrt{d_{11}(c_{kk} + d_{11} b_{1k.}^2)}} = \frac{b_{1k.}}{+\sqrt{\dfrac{c_{kk}}{d_{11}} + b_{1k.}^2}}. \tag{8}$$

But, since

$$S_{b_{1k.}}^2 = \frac{c_{kk}}{N' d_{11}},$$

(8) can be written

$$r_{1k.} = \frac{b_{1k.}}{+\sqrt{N' S_{b_{1k.}}^2 + b_{1k.}^2}}.$$

Since writing this paper I have found that Dr. M. A. Girshick of the 
United States Department of Agriculture has worked independently 
upon the problem of increasing the order of an inverse matrix, and has 
actually computed the complete inverses of a number of matrices by 
the repeated use of (6). Dr. Girshick and I have developed simple 
work sheets for using this method to compute successively a 1-rowed, 
2-rowed, ..., n-rowed matrix, by adding a row at each step. The 
number of multiplications appears to be almost identical to the number 
required by the Doolittle Method.

AN APPLICATION OF SEQUENTIAL SAMPLING TO
TESTING STUDENTS
DUDLEY J. COWDEN
University of North Carolina


This paper provides an illustration of an application to the 
field of education of a technique hitherto used mainly in in- 
dustry. The effect of systematic stratified sampling is briefly 
discussed. 


STATISTICAL methods have recently been developed for the control
of quality in industry which are adaptable to the field of education.
In industry quality control serves two purposes: (1) control of
the process of manufacture; (2) control of the quality of accepted lots. 
In acceptance control one or more samples from each manufactured lot 
is inspected to see whether the lot should be accepted or rejected. 
It would be desirable to accept all good lots and to reject all bad lots; 
but when dealing with samples one cannot avoid making an occasional 
error of rejecting a good lot or of accepting a bad lot. Having defined 
what is meant by a good lot (say one containing one-half per cent or 
less of defective items) and a bad lot (say one containing five per cent 
or more of defective items) it is then possible to set standards of ac- 
ceptance and rejection which will give any desired degree of assurance 
against making either of the above mentioned types of error. If one is 
to select a single sample, however, and base his decision on that sample, 
the size of the sample required may be very large, and the amount of 
inspection, therefore, very large. Furthermore, for some products the 
testing is destructive, and it is highly desirable to have the sample size 
as small as is consistent with the requisite ends. Consequently, a very 
interesting procedure has been worked out, known as sequential
sampling,¹ whereby items are tested one at a time (or in small groups)
and a decision to accept or to reject the lot made as soon as enough
data have been accumulated to justify one of these decisions. One never
knows in advance how many items will be required to reach a decision,
although an estimate can be made of the average size of sample that will
be needed. If the quality of the lot is very good or very bad a decision
will generally be reached with a small sample; if the quality of the lot
is of an intermediate character a very large sample may be required.

¹ The theory of sequential sampling was worked out by Abraham Wald. For the mathematical
theory see Abraham Wald, “Sequential Tests of Statistical Hypotheses,” The Annals of Mathematical
Statistics, Vol. XVI, June 1945, pp. 117-186. For a shorter and less mathematical treatment see
Abraham Wald, “Sequential Method of Sampling for Deciding between Two Courses of Action,”
Journal of the American Statistical Association, Vol. 40, September 1945, pp. 277-306. For detailed
instruction concerning the use of the method see Statistical Research Group, Columbia University,
Sequential Analysis of Statistical Data: Applications, Columbia University Press, 1945.

When an examination is given to a student it sometimes happens 
that not enough questions are asked to permit a fair evaluation of his 
knowledge and ability. On the other hand the examination is some- 
times drawn out longer than is necessary. If a student is very good or 
very poor only a few questions may be needed to establish this fact 
beyond reasonable doubt; but borderline students should be examined 
at considerable length before deciding whether they should be passed 
or failed. If sequential sampling is used the fate of good students and 
of poor students tends to be determined quickly, but mediocre stu- 
dents must continue with the examination until the results give ade- 
quate grounds for a decision. By use of the sequential method the 
number of questions answered by a student is reduced to a minimum, 
and at the same time the probability of passing a poor student or 
failing a good student is controlled. 

The writer recently conducted an experiment with the method of 
sequential analysis as part of a final examination in a small class in 
elementary statistics at the University of North Carolina. The method 
was used in administering a true-false examination. This examination 
consists of 200 questions, classified into ten sections according to
subject matter with a varying number of questions in the different
sections and arranged within classes according to degree of difficulty. These 200
questions may be regarded as a sample from a much larger list of 
similar questions which could be asked. Past experience over several 
years indicates that the test scores should be graded as in Table 1. 


TABLE 1
PER CENT OF ERRORS AND GRADES ON A TRUE-FALSE EXAMINATION

Per cent of errors               | Grade
---------------------------------+-------
20 or less                       |   A
More than 20 but including 27    |   B
More than 27 but including 32    |   C
More than 32 but including 35    |   D
More than 35 but including 38    |   E
More than 38                     |   F





On the basis of a particular set of questions it would be ideal to pass 
every student who could answer more than 65 per cent of all ques- 
tions of the type included in this examination, and fail every student 
who could answer less than 65 per cent. But of course a D-grade student
might be so unfortunate as to draw a set of questions that was
unusually difficult for him, while an E-grade student might be lucky
enough to pass because he drew a number of questions that were quite 
to his liking. Consequently we must make some compromises. First, 
we must decide the proportion of good students we are willing to fail 
and the proportion of poor students we are willing to pass. Also (unless 
these two proportions are to be complementary, such as .8 and .2, or 
.4 and .6) we cannot make a hairline distinction between a good student
and a poor student, but must leave a zone of mediocrity into
which some students may fall. That this is true may be seen for the 
case of students who are able (in the long run) to pass exactly 65 per 
cent of all possible questions similar to those in this examination. If 
we fail 20 per cent of these students on a given test we must of course 
pass the other 80 per cent. But if we arbitrarily decide that a good 
student is one who can answer 70 per cent of the questions, while a 
poor student is one who can answer only 60 per cent of the questions, 
the proportion of good students whom we fail need bear no relationship 
to the proportion of poor students whom we pass. The finer the dis- 
tinction we make between a good student and a poor student the 
larger the number of questions we must ask in order to have the de- 
sired assurance that the right decision will be made. 

In the present instance it was decided, largely on the basis of judg- 
ment, that we were willing to fail not more than 20 per cent of all 
students who could answer 70 per cent or more of the questions, while 
we were willing to pass not more than 10 per cent of all students who 
could answer 60 per cent or less of the questions. Thus we were willing 
to fail one out of every five C-grade students and to pass one out of 
every ten F-grade students. Let us adopt these conventional symbols: 

$p_1$, the maximum proportion of errors in all possible questions of a
given type made by a student who is definitely good;

$p_2$, the minimum proportion of errors in all possible questions of a
given type made by a student who is definitely poor;

$\alpha$, the probability of failing a good student;

$\beta$, the probability of passing a poor student.

The numerical values which we have determined, mainly subjectively,
for these constants are:

$p_1 = .30$        $\alpha = .20$
$p_2 = .40$        $\beta = .10$


It may seem outrageous to the reader that we should be willing to fail
one of every five C-grade students. In this case, however, an injustice
was not inflicted, since at the end of each “round” of 20 questions the
student was given the option (1) of accepting his grade, providing the
results were determinate, or (2) of continuing with the examination.
It was more important not to pass very poor students, since such
students would not usually exercise their option of continuing with an
examination which they had already passed. Also it must be remembered
that the true-false questions constituted only half of the final
examination; the other half consisted of problems. Finally, the grade
in the course was only partly determined by the grade on the final
examination; five one-hour quizzes were given during the term.²
With respect to the true-false test, if the results were indeterminate at
the end of 20 questions the student was required to take 20 more,
making 40 altogether. If the results were still indeterminate, 20 more
were answered, for a total of 60 questions. No student was required to
continue beyond 60 questions, nor permitted to continue beyond 100
questions. Some upper limit was necessary because the period of time
allotted to the examination was limited.

Using $D_1$ (decision number 1) to indicate the number of questions
that can be missed and still permit a student to pass, $D_2$ (decision
number 2) to indicate the number of questions that must be answered
wrong before a student is failed, and $N$ to indicate the cumulative
number of questions answered, the method of sequential analysis has
been worked out in such a way that two linear equations result:

$$D_1 = a_1 + bN;$$
$$D_2 = a_2 + bN.$$

As can be seen, the straight lines representing these two equations are
parallel, and differ only as to the constants $a_1$ and $a_2$. These
constants depend on the values of $p_1$, $p_2$, $\alpha$, and $\beta$
adopted. The more widely $p_1$ and $p_2$ differ the closer together the
two lines will be, and therefore the more quickly will a decision be
reached. The larger the value of $\alpha$ or $\beta$ the smaller will be
the value of $a_2$, and the larger (algebraically) will be the value of
$a_1$. Therefore to bring the two lines closer together we must increase
$\alpha$ and/or $\beta$. $a_1$ is always negative, since answering all
questions correctly does not very strongly indicate knowledge of the
subject until a reasonable number of questions are answered (what is a
reasonable number depends on the value adopted for $\beta$, becoming
larger as $\beta$ is made smaller). On the other hand, $a_2$ is always
positive; but a decision to fail cannot be reached until $D_2 \le N$,
since a student cannot miss more questions than he answers. When
$\alpha = \beta$, $a_2 = -a_1$. The slope $b$ is independent of $\alpha$
and $\beta$, but depends exclusively on $p_1$ and $p_2$. $b$ is not only
the slope of the two lines, but it is the limiting value for per cent of
errors for $D_1$ and $D_2$ as $N$ approaches infinity (see Columns 3 and
6 of Table 2). In the present instance the values (to two digits) of
$p_1$ and $p_2$ were chosen so as to make the value of $b$, which is the
dividing line between passing and failing, .35 when rounded to two digits.

² We thus have seven observations on the ability of each student. In case the average or the range
of grades for any student is too far out of line in comparison with the rest of the class, this student
may be considered “out of control”; and his case may be further considered (as by being given a
condition in the course). In industry a “control chart” is generally used for a similar purpose. For a simple
exposition of the control chart method see American Standards Association, Control Chart Method of
Controlling Quality During Production (ASA Z1.3-1942), New York, 1942.

The constants $a_1$, $a_2$, and $b$ are easily computed after certain
intermediate values $g_1$, $g_2$, $h_1$, and $h_2$ have been obtained.
These intermediate values are also useful for other purposes, which will
be explained. The formulas are given below, and also their application
to the present illustration:

$$g_1 = \log\frac{p_2}{p_1} = \log\frac{.4}{.3} = .124938;$$

$$g_2 = \log\frac{1 - p_1}{1 - p_2} = \log\frac{.7}{.6} = .066947;$$

$$h_1 = \frac{\log\dfrac{1-\alpha}{\beta}}{g_1 + g_2} = \frac{\log\dfrac{.8}{.1}}{.191885} = 4.706 = -a_1;$$

$$h_2 = \frac{\log\dfrac{1-\beta}{\alpha}}{g_1 + g_2} = \frac{\log\dfrac{.9}{.2}}{.191885} = 3.404 = a_2;$$

$$b = \frac{g_2}{g_1 + g_2} = \frac{.066947}{.191885} = .3489;$$

$$D_1 = -4.706 + .3489N;$$
$$D_2 = 3.404 + .3489N.$$


Applying these equations we obtain the results shown in Table 2.
Each value recorded for $D_1$ is the next integer smaller than the
number obtained by the formula, while for $D_2$ it is the next integer
larger. In this table decision numbers are shown only for selected
values of $N$. Actually it would be possible to fail a student on the
basis of six questions if he misses all of them, or to pass him on the
basis of 14 questions all answered correctly. The grades shown are those
of Table 1. In case the examination ends before a statistical decision
is reached within the framework of the sequential sampling plan, the
student is given the grade indicated by Table 1.
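The arithmetic behind the decision numbers can be checked with a short Python sketch. It assumes common (base-10) logarithms, which reproduce the intermediate values quoted in the text, and it takes "next integer smaller" and "next integer larger" to mean the floor and the ceiling of the unrounded line. The function names are hypothetical.

```python
import math


def plan_constants(p1, p2, alpha, beta):
    """Return (h1, h2, b) for the sequential plan, using common logarithms."""
    g1 = math.log10(p2 / p1)
    g2 = math.log10((1 - p1) / (1 - p2))
    h1 = math.log10((1 - alpha) / beta) / (g1 + g2)   # h1 = -a1
    h2 = math.log10((1 - beta) / alpha) / (g1 + g2)   # h2 = a2
    b = g2 / (g1 + g2)                                # common slope
    return h1, h2, b


def decision_numbers(n, h1, h2, b):
    """D1: errors permitted for passing; D2: errors required for failing."""
    d1 = math.floor(-h1 + b * n)   # next integer below the passing line
    d2 = math.ceil(h2 + b * n)     # next integer above the failing line
    return d1, d2


if __name__ == "__main__":
    h1, h2, b = plan_constants(0.3, 0.4, 0.2, 0.1)
    for n in (20, 40, 60, 80, 100):
        print(n, decision_numbers(n, h1, h2, b))
```

A negative value of D1 simply means that no passing decision is yet possible at that number of questions.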


TABLE 2
DECISION NUMBERS AND GRADES FOR SELECTED
NUMBERS OF QUESTIONS

Number of  | Number of errors | Per cent  |       | Number of errors | Per cent  |
questions  | permitted for    | of errors | Grade | required for     | of errors | Grade
           | passing          |           |       | failing*         |           |
    N      |      D_1         |           |       |      D_2         |           |
-----------+------------------+-----------+-------+------------------+-----------+-------
      20   |        2         |   10.0    |   A   |       11         |   55.0    |   F
      40   |        9         |   22.5    |   B   |       18         |   45.0    |   F
      60   |       16         |   26.7    |   B   |       25         |   41.7    |   F
      80   |       23         |   28.8    |   C   |       32         |   40.0    |   F
     100   |       30         |   30.0    |   C   |       39         |   39.0    |   F
     200   |       65         |   32.5    |   D   |       73         |   36.5    |   E
   1,000   |      344         |   34.4    |   D   |      353         |   35.3    |   E
infinity   |        -         |   34.89   |   D   |        -         |   34.89   |   D

* If an extremely large number of questions were answered an “unfavorable” decision could
conceivably be reached which would barely pass a student, since the proportion of errors approaches
$b$ (= .3489) as $N$ approaches infinity. By selecting a slightly higher value for $p_1$ and/or $p_2$ a value of exactly
.35 could be obtained for $b$, and an “unfavorable” decision would always fail a student. For instance, if
$p_1 = .3022$ and $p_2 = .4$, or $p_1 = .3$ and $p_2 = .4023$, $b = .3500$.


TABLE 3
RESULTS OF A TRUE-FALSE EXAMINATION USING THE
SEQUENTIAL METHOD OF SAMPLING
(After each name is indicated the number of errors, and the grade if a
decision is reached)

       |                     Number of questions                      |
 Rank  |      20       |      40       |      60       |      80      | Final grade
-------+---------------+---------------+---------------+--------------+------------
 D_1   |       2       |       9       |      16       |      23      |
   1   | Carr: 6       | Carr: 5 (A)   |       -       |       -      | Carr: A
   2   | Horn: 5       | Holt: 11      | Holt: 16 (B)  |       -      | Holt: B
   3   | Abel: 6       | Horn: 11      | Abel: 17      |              | Abel: C
   4   | Gill: 7       | Swan: 11      | Horn: 18      |              | Horn: C
   5   | Hart: 7       | Abel: 12      | Swan: 18      |              | Swan: C
   6   | Holt: 7       | Gill: 13      | Ford: 19      |              | Ford: C
   7   | Swan: 7       | Hart: 14      | Gill: 20      |              | Gill: C
   8   | Lamb: 8       | Ford: 16      | Lamb: 23      | Lamb: 28     | Lamb: D
   9   | Ford: 9       | Lamb: 16      | Hart: 25 (F)  |              | Hart: F
  10   | Sage: 11 (F)  |       -       |       -       |              | Sage: F
 D_2   |      11       |      18       |      25       |      32      |     -





In the present case students were required to answer 20 questions
each “round.” The sets of 20 questions were not strictly random
samples. For round one, all questions with numbers ending in 5 were
answered; for round 2 all questions with numbers ending in 0; and so
on. It seems plausible that the effect of this method of sampling is to
reduce the sampling variability, and therefore to make the probability
of failing a good student somewhat less than $\alpha$, and the
probability of passing a poor student somewhat less than $\beta$.

The results of the experiment are shown (using fictitious names) in 
Table 3. Both of the students who went out on the unfavorable side 
decided to continue with the examination. One of them continued 
through 100 questions, but ended up with more than $D_2$ errors. The
other voluntarily terminated the examination with 80 questions and 
managed to get barely into the indeterminate zone, but not far enough 
to avoid a grade of F. (Table 3 does not show voluntary continuations 
after a decision has been reached by sequential analysis.) 
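How the decision numbers are applied round by round can be sketched in Python. The constants are the rounded values given in the text, the error counts are the cumulative figures recorded for four students in Table 3, and the function name `sequential_outcome` is hypothetical. Voluntary continuations after a decision are not modelled.

```python
import math

H1, H2, B = 4.706, 3.404, 0.3489   # rounded constants from the text


def sequential_outcome(cumulative_errors, round_size=20):
    """Return (questions answered, 'pass' / 'fail' / 'indeterminate')."""
    n = 0
    for errors in cumulative_errors:
        n += round_size
        d1 = math.floor(-H1 + B * n)   # errors permitted for passing
        d2 = math.ceil(H2 + B * n)     # errors required for failing
        if errors <= d1:
            return n, "pass"
        if errors >= d2:
            return n, "fail"
    return n, "indeterminate"


if __name__ == "__main__":
    records = {
        "Holt": [7, 11, 16],
        "Hart": [7, 14, 25],
        "Sage": [11],
        "Lamb": [8, 16, 23, 28],
    }
    for name, errors in records.items():
        print(name, sequential_outcome(errors))
```

A student whose record ends as "indeterminate" is graded from Table 1 on the per cent of errors at the point where the examination stopped.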

Two questions remain to be answered. (1) How many questions will 
be required, on the average, to reach a decision? (2) How close to ideal 
are the results for this particular sampling scheme? 

The number of questions that tend to be required depends on the 
caliber of the student. Mediocre students will tend to require more 
questions than very good or very poor students. The expected number 
of questions $\bar{N}$ for each of five different types of student can
be ascertained easily by use of approximate formulas. This is done below.





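Proportion of questions that will be missed in populations of
questions ($p$), and average number of questions required ($\bar{N}$):

$p = 0.0$:
$$\bar{N}_0 = \frac{h_1}{b} = \frac{4.706}{.3489} = 13.5$$

$p = p_1 = .3$:
$$\bar{N}_{p_1} = \frac{(1-\alpha)h_1 - \alpha h_2}{b - p_1} = \frac{.8(4.706) - .2(3.404)}{.3489 - .3} = 63.1$$

$p = b = .3489$:
$$\bar{N}_b = \bar{N}_0\,\bar{N}_1 = (13.488)(5.228) = 70.5$$

$p = p_2 = .4$:
$$\bar{N}_{p_2} = \frac{(1-\beta)h_2 - \beta h_1}{p_2 - b} = \frac{.9(3.404) - .1(4.706)}{.4 - .3489} = 50.7$$

$p = 1.0$:
$$\bar{N}_1 = \frac{h_2}{1 - b} = \frac{3.404}{.6511} = 5.2$$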
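The five average sample sizes can be reproduced with the following Python sketch, which evaluates the approximate formulas at the five values of p. It assumes common logarithms as in the computation of the constants; the function name `average_sample_numbers` and the dictionary keys are hypothetical labels.

```python
import math


def average_sample_numbers(p1, p2, alpha, beta):
    """Approximate average number of questions at five values of p."""
    g1 = math.log10(p2 / p1)
    g2 = math.log10((1 - p1) / (1 - p2))
    h1 = math.log10((1 - alpha) / beta) / (g1 + g2)
    h2 = math.log10((1 - beta) / alpha) / (g1 + g2)
    b = g2 / (g1 + g2)
    n0 = h1 / b          # student who misses nothing
    n1 = h2 / (1 - b)    # student who misses everything
    return {
        "p = 0": n0,
        "p = p1": ((1 - alpha) * h1 - alpha * h2) / (b - p1),
        "p = b": n0 * n1,
        "p = p2": ((1 - beta) * h2 - beta * h1) / (p2 - b),
        "p = 1": n1,
    }


if __name__ == "__main__":
    for label, value in average_sample_numbers(0.3, 0.4, 0.2, 0.1).items():
        print(f"{label:7s} {value:6.1f}")
```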


The next integer larger than $\bar{N}_0$ is the smallest sample size
possible for passing a student, while the next integer larger than
$\bar{N}_1$ is the smallest possible sample size for failing him. Since
most of the students will tend to be C-grade, and will miss about 30 per
cent of the questions, about 63 questions should be required on the
average. In the present instance it appears from Table 3 that the
average number of questions required for a decision is somewhat larger
than indicated by the formulas. It seems to the writer that this is at
least partly accounted for by the non-random character of the sample,
which should have the effect of reducing the sampling variability.³ The
average number of questions required reaches a maximum at a value for
$p$ that is close to $b$, $\bar{N}_b = 71$ in this case. In general it
is advisable that the teacher be
prepared to administer at least 2, questions. If attainment of the 
desired objectives is possible only by use of more questions than it is 
practical to administer, one must revise his objectives. The number of 
questions required can be reduced by adopting values of $p_1$, $p_2$,
$\alpha$, and $\beta$ so as to bring the values of $a_1$ and $a_2$
closer together. Methods of accomplishing this result have already been given.

Comparison of the results of a given plan with ideal can be accom- 
plished by plotting an operating characteristic curve. Such a curve is 
shown in the accompanying chart. The horizontal axis shows p the 
proportion of questions which the student tends to miss in the long 
run. This is roughly one-half the questions to which he doesn’t know 
the answers, since he should guess about one-half of those correctly 
anyway. The vertical axis shows L(p) the probability of passing the 
examination. Four points for this curve are already known, while a 
fifth can be computed very easily. 





  $p$          $L(p)$
  0            1.00
  $p_1$        $1 - \alpha = .80$
  $b$          $\dfrac{h_2}{h_1 + h_2} = \dfrac{3.404}{4.706 + 3.404} = .42$
  $p_2$        $\beta = .10$
  1            0
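Intermediate points of the operating characteristic curve can be obtained from Wald's parametric approximation, which is part of the general theory of sequential tests and is not stated in this paper. In the Python sketch below the auxiliary parameter `h` and the function name `oc_point` are assumptions introduced for illustration: h = 1 gives the point at $p_1$, h = -1 the point at $p_2$, and values of h near zero approach the point at $p = b$.

```python
def oc_point(h, p1=0.3, p2=0.4, alpha=0.2, beta=0.1):
    """Return (p, L(p)) for a nonzero value of the auxiliary parameter h."""
    A = (1 - beta) / alpha        # upper limit of the probability ratio
    B = beta / (1 - alpha)        # lower limit of the probability ratio
    r1 = p2 / p1
    r2 = (1 - p2) / (1 - p1)
    p = (1 - r2 ** h) / (r1 ** h - r2 ** h)   # long-run proportion of errors
    L = (A ** h - 1) / (A ** h - B ** h)      # probability of passing
    return p, L


if __name__ == "__main__":
    for h in (3, 2, 1, 0.5, 1e-6, -0.5, -1, -2, -3):
        p, L = oc_point(h)
        print(f"h = {h:>6}:  p = {p:.4f}   L(p) = {L:.4f}")
```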


³ The true decision number lines can be brought closer together by increasing $\alpha$ and/or $\beta$, or by
employing the principle of stratified sampling and thus reducing the sampling variability. If the
sampling variability is reduced, but the $D_1$ and $D_2$ lines are not accordingly brought closer together,
too large a proportion of the points will tend to fall in the indeterminate zone between $D_1$ and $D_2$ until
a fairly large sample is accumulated. To take an extreme illustration, where a student always runs true
to form and there is therefore no sampling variability: (1) a decision will never be reached for a
“borderline” student, while according to random sampling theory the average sample size for reaching a
decision is 70.5; (2) a “good” student cannot fail, and can pass only by answering 100 questions and
missing .3 × 100 = 30 questions, as compared with an average sample size of 63.1 for random sampling;
(3) a poor student cannot pass, and can fail only by answering 80 questions and missing .4 × 80 = 32
questions, as compared with an average sample size of 50.7 for random sampling.

The chart shows a vertical broken line at $p = .35$ (which is almost the
exact value of $b$). Ideally all students should be passed for whom
$p < .35$, and all students should be failed for whom $p > .35$. The
closer $p_1$ and $p_2$ are together, and the smaller the values of
$\alpha$ and $\beta$, the closer the curve approaches to ideal; but also
the larger the average number of questions that will be required.


OPERATING CHARACTERISTIC CURVE
($p_1 = .3$; $p_2 = .4$; $\alpha = .2$; $\beta = .1$)

In administering this examination 20 true-false questions were first
given each student. These were graded while he was working on his
problems. When the problems were turned in, 20 more true-false
questions were given, then a third set of 20 if necessary, and so on.
Administrative difficulties encountered indicate that it would be better
to give (say) 40 questions on the first round, with a diminishing
number on subsequent rounds. It also seems that it would be better to
allow enough time for the examination to permit as many as 150
questions to be answered by marginal students. Another point to consider
is whether it is better to give a random sample of questions each
round, or a systematic stratified sample such as that used in this
experiment. Since the variability from sample to sample is presumably
smaller for the more representative systematic stratified sample, and
therefore, apparently, the probability is reduced of unjustifiably
passing or failing a student, it is the writer’s opinion that a sample of
the type selected is better. On the other hand, it might be worthwhile
to work out some modification of the present formulas for use with
procedures of the type we have described.

THE STATISTICAL SIGN TEST*


W. J. DIXON
University of Oregon
A. M. MOOD
Iowa State College


This paper presents and illustrates a simple statistical test 
for judging whether one of two materials or treatments is bet- 
ter than the other. The data to which the test is applied consist 
of paired observations on the two materials or treatments. 
The test is based on the signs of the differences between the 
pairs of observations. 

It is immaterial whether all the pairs of observations are
comparable or not. However, when all the pairs are comparable,
there are more efficient tests (the t test, for example)
which take account of the magnitudes as well as the signs of the
differences. Even in this case, the simplicity of the sign test
makes it a useful tool for a quick preliminary appraisal of the
data.

In this paper the results of previously published work on the
sign test have been included, together with a table of
significance levels and illustrative examples.


INTRODUCTION 


IN EXPERIMENTAL investigations, it is often desired to compare two
materials or treatments under various sets of conditions. Pairs of
observations (one observation for each of the two materials or treat- 
ments) are obtained for each of the separate sets of conditions. For 
example, in comparing the yield of two hybrid lines of corn, A and B, 
one might have a few results from each of several experiments car- 
ried out under widely varying conditions. The experiments may have 
been performed on different soil types, with different fertilizers, and 
in different years with consequent variations in seasonal effects such as 
rainfall, temperature, amount of sunshine, and so forth. It is supposed 
that both lines appeared equally often in each block of each experiment 
so that the observed yields occur in pairs (one yield for each line) pro- 
duced under quite similar conditions. 

The above example illustrates the circumstances under which the
sign test is most useful:

(a) There are pairs of observations on two things being compared.

(b) Each of the two observations of a given pair arose under similar
conditions.

(c) The different pairs were observed under different conditions.

This last condition generally makes the t test invalid. If this were not
the case (that is, if all the pairs of observations were comparable), the
t test would ordinarily be employed unless there were other reasons, for
example, obvious non-normality, for not using it.

* This paper is an adaptation of a memorandum submitted to the Applied Mathematics Panel by
the Statistical Research Group, Princeton University. The Statistical Research Group operated under
a contract with the Office of Scientific Research and Development, and was directed by the Applied
Mathematics Panel of the National Defense Research Committee.

Even when the t test is the appropriate technique many statisticians
like to use the sign test because of its extreme simplicity. One merely 
counts the number of positive and negative differences and refers to a 
table of significance values. Frequently the question of significance 
may be settled at once by the sign test without any need for calcula- 
tions. 

It should be pointed out that, strictly speaking, the methods of this
paper are applicable only to the case in which no ties in paired
comparisons occur. In practice, however, even when ties would not occur
if measurements were sufficiently precise, ties do occur because
measurements are often made only to the nearest unit or tenth of a unit,
for example. Such ties should be included among the observations with
half of them being counted as positive and half negative.

Finally, it is assumed that the differences between paired observa- 
tions are independent, that is, that the outcome of one pair of obser- 
vations is in no way influenced by the outcome of any other pair. 


PROCEDURE 


Let A and B represent two materials or treatments to be compared.
Let $x$ and $y$ represent measurements made on A and B. Let the
number of pairs of observations be $n$. The $n$ pairs of observations
and their differences may be denoted by:

$$(x_1, y_1),\ (x_2, y_2),\ \ldots,\ (x_n, y_n)$$
and
$$x_1 - y_1,\ x_2 - y_2,\ \ldots,\ x_n - y_n.$$


The sign test is based on the signs of these differences. The letter r 
will be used to denote the number of times the less frequent sign occurs. 
If some of the differences are zero, half of them will be given a plus sign 
and half a minus sign. 

As an example of the type of data for which the sign test is
appropriate, we may consider the following yields of two hybrid lines of
corn obtained from several different experiments. In this example
$n = 28$ and $r = 7$.

If there is no difference in the yielding ability of the two lines, the
positive and negative signs should be distributed by the binomial
distribution with $p = \tfrac{1}{2}$. The null hypothesis here is that
each difference has a probability distribution (which need not be the
same for all differences) with median equal to zero. This null
hypothesis will obtain, for instance, if each difference is
symmetrically distributed about a mean of zero, although such symmetry
is not necessary. The null hypothesis will be rejected when the numbers
of positive and negative signs differ significantly from equality.


YIELDS OF TWO HYBRID LINES OF CORN

Experiment    Yield of      Sign of  |  Experiment    Yield of      Sign of
  Number      A       B      x-y     |    Number      A       B      x-y
--------------------------------------+-------------------------------------
     1       47.8    46.1     +      |       4       40.8    41.3     -
             48.6    50.1     -      |               39.8    40.8     -
             47.6    48.2     -      |               42.2    42.0     +
             43.0    48.6     -      |               41.4    42.5     -
             42.1    43.4     -      |
             41.0    42.9     -      |       5       38.9    39.1     -
                                     |               39.0    39.4     -
     2       28.9    38.6     -      |               37.5    37.3     +
             29.0    31.1     -      |
             27.4    28.0     -      |       6       36.8    37.5     -
             28.1    27.5     +      |               35.9    37.3     -
             28.0    28.7     -      |               33.6    34.0     -
             28.3    28.8     -      |
             26.4    26.3     +      |       7       39.2    40.1     -
             26.8    26.1     +      |               39.1    42.6     -
     3       33.3    32.4     +      |
             30.6    31.7     -      |























Table 1 gives the critical values of r for the 1, 5, 10, and 25 per cent 
levels of significance. A discussion of how these values are computed 
may be found in the appendix. A value of r less than or equal to that 
in the table is significant at the given per cent level. 

Thus in the example above where n=28 and r=7, there is sig- 
nificance at the 5% level, as shown by Table 1. That is, the chances 
are only 1 in 20 of obtaining a value of r equal to or less than 8 when 
there is no real difference in the yields of the two lines of corn. It is 
concluded, therefore, at the 5% level of significance, that the two lines 
have different yields. 
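The counting of signs and the significance level for the corn example can be checked with a short Python sketch. The pairs are transcribed from the table of yields. The function names are hypothetical, and the handling of an odd number of ties (the extra tie is counted with the less frequent sign, so that it works against significance) is an assumption, since the text covers only an even split.

```python
from math import comb

# (yield of A, yield of B) for the 28 pairs in the table of corn yields
PAIRS = [
    (47.8, 46.1), (48.6, 50.1), (47.6, 48.2), (43.0, 48.6), (42.1, 43.4),
    (41.0, 42.9),
    (28.9, 38.6), (29.0, 31.1), (27.4, 28.0), (28.1, 27.5), (28.0, 28.7),
    (28.3, 28.8), (26.4, 26.3), (26.8, 26.1),
    (33.3, 32.4), (30.6, 31.7),
    (40.8, 41.3), (39.8, 40.8), (42.2, 42.0), (41.4, 42.5),
    (38.9, 39.1), (39.0, 39.4), (37.5, 37.3),
    (36.8, 37.5), (35.9, 37.3), (33.6, 34.0),
    (39.2, 40.1), (39.1, 42.6),
]


def sign_test_r(pairs):
    """Return (n, r), r being the count of the less frequent sign."""
    plus = sum(1 for x, y in pairs if x > y)
    minus = sum(1 for x, y in pairs if x < y)
    ties = len(pairs) - plus - minus
    # Ties are split half and half; an odd tie goes to the smaller count
    # (an assumption made here, not a rule stated in the text).
    return len(pairs), min(plus, minus) + (ties + 1) // 2


def two_sided_level(n, r):
    """Chance of r or fewer of either sign when each sign has p = 1/2."""
    return 2 * sum(comb(n, i) for i in range(r + 1)) / 2 ** n


if __name__ == "__main__":
    n, r = sign_test_r(PAIRS)
    print(n, r, round(two_sided_level(n, r), 4))
```

For the corn data the level comes out well under 5 per cent, in agreement with the conclusion drawn from Table 1.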

In general, there are no values of r which correspond exactly to the 
levels of significance 1, 5, 10, 25 per cent. The values given are such 
that they result in a level of significance as close as possible to, but 
not exceeding 1, 5, 10, 25 per cent. Thus, the test is a little more strict,
on the average, than the level of significance which is indicated. For
small samples the test is considerably more strict in some cases. For
example, the value of r for n = 12 for the 10 per cent level of significance
actually corresponds to a per cent level less than 5.

TABLE 1
TABLE OF CRITICAL VALUES OF r FOR THE SIGN TEST

        Per Cent Level of      |         Per Cent Level of
   n      Significance         |    n      Significance
         1    5   10   25      |          1    5   10   25
-------------------------------+-------------------------------
   1     -    -    -    -      |   51    15   18   19   20
   2     -    -    -    -      |   52    16   18   19   21
   3     -    -    -    0      |   53    16   18   20   21
   4     -    -    -    0      |   54    17   19   20   22
   5     -    -    0    0      |   55    17   19   20   22
   6     -    0    0    1      |   56    17   20   21   23
   7     -    0    0    1      |   57    18   20   21   23
   8     0    0    1    1      |   58    18   21   22   24
   9     0    1    1    2      |   59    19   21   22   24
  10     0    1    1    2      |   60    19   21   23   25
  11     0    1    2    3      |   61    20   22   23   25
  12     1    2    2    3      |   62    20   22   24   25
  13     1    2    3    3      |   63    20   23   24   26
  14     1    2    3    4      |   64    21   23   24   26
  15     2    3    3    4      |   65    21   24   25   27
  16     2    3    4    5      |   66    22   24   25   27
  17     2    4    4    5      |   67    22   25   26   28
  18     3    4    5    6      |   68    22   25   26   28
  19     3    4    5    6      |   69    23   25   27   29
  20     3    5    5    6      |   70    23   26   27   29
  21     4    5    6    7      |   71    24   26   28   30
  22     4    5    6    7      |   72    24   27   28   30
  23     4    6    7    8      |   73    25   27   28   31
  24     5    6    7    8      |   74    25   28   29   31
  25     5    7    7    9      |   75    25   28   29   32
  26     6    7    8    9      |   76    26   28   30   32
  27     6    7    8   10      |   77    26   29   30   32
  28     6    8    9   10      |   78    27   29   31   33
  29     7    8    9   10      |   79    27   30   31   33
  30     7    9   10   11      |   80    28   30   32   34
  31     7    9   10   11      |   81    28   31   32   34
  32     8    9   10   12      |   82    28   31   33   35
  33     8   10   11   12      |   83    29   32   33   35
  34     9   10   11   13      |   84    29   32   33   36
  35     9   11   12   13      |   85    30   32   34   36
  36     9   11   12   14      |   86    30   33   34   37
  37    10   12   13   14      |   87    31   33   35   37
  38    10   12   13   14      |   88    31   34   35   38
  39    11   12   13   15      |   89    31   34   36   38
  40    11   13   14   15      |   90    32   35   36   39
  41    11   13   14   16      |   91    32   35   37   39
  42    12   14   15   16      |   92    33   36   37   39
  43    12   14   15   17      |   93    33   36   38   40
  44    13   15   16   17      |   94    34   37   38   40
  45    13   15   16   18      |   95    34   37   38   41
  46    13   15   16   18      |   96    34   37   39   41
  47    14   16   17   19      |   97    35   38   39   42
  48    14   16   17   19      |   98    35   38   40   42
  49    15   17   18   19      |   99    36   39   40   43
  50    15   17   18   20      |  100    36   39   41   43

For n > 100, approximate values of r may be found by taking the nearest integer less than
$\tfrac{1}{2}n - k\sqrt{n}$, where k = 1.3, 1, .82, .58 for the 1, 5, 10, 25 per cent values
respectively. A closer approximation to the values of r is obtained from
$\tfrac{1}{2}(n - 1) - k\sqrt{n + 1}$ and the more exact values of k, 1.2879, .9800, .8224, .5752.
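Critical values of r can be recomputed directly from the binomial distribution with p = 1/2, as in the Python sketch below. It assumes the rule stated in the text, namely the largest r whose two-sided probability does not exceed the nominal level, and it uses exact fractions so that a boundary case such as n = 3 at the 25 per cent level is handled correctly. The function names `critical_r` and `approximate_r` are hypothetical.

```python
from fractions import Fraction
from math import ceil, comb, sqrt


def critical_r(n, per_cent):
    """Largest r significant at the given per cent level, or None."""
    limit = Fraction(per_cent, 100)
    tail = Fraction(0)
    best = None
    for r in range(n // 2 + 1):
        tail += Fraction(2 * comb(n, r), 2 ** n)   # both signs counted
        if tail > limit:
            break
        best = r
    return best


def approximate_r(n, k):
    """Nearest integer less than (n - 1)/2 - k * sqrt(n + 1)."""
    return ceil((n - 1) / 2 - k * sqrt(n + 1)) - 1


if __name__ == "__main__":
    for n in (12, 28):
        print(n, [critical_r(n, level) for level in (1, 5, 10, 25)])
    print(100, [approximate_r(100, k) for k in (1.2879, 0.98, 0.8224, 0.5752)])
```

The case n = 12 shows the strictness noted in the text: the 5 and 10 per cent levels share the same critical value.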

The critical values of r in Table 1 for the various levels of significance 
were computed for the cases where either the +’s or —’s occur a sig- 
nificantly small number of times. Sometimes the interest may be in 
only one of the signs. For example, in testing two treatments, A and B, 
A may be identical with B except for certain additions which can only 
have the effect of improving B. In this case one would be interested 
only in whether the deficiency of minus signs (for differences in the 
direction A minus B) were significant or not. In cases of this kind the 
per cent levels of significance in Table 1 would be divided by two. Thus, 
8 minus signs in a sample of 28 would correspond to the 2.5% level of 
significance. 

SIZE OF SAMPLE 


Even though there is no real difference, a sample of four or even five 
with all signs alike will occur by chance more than 5% of the time. 
Four signs alike will occur by chance 12.5% of the time and five signs 
alike will occur by chance 6.25% of the time. Therefore, at the 5% 
level of significance, it is necessary to have at least six pairs of observa- 
tions even if all signs are alike before any decision can be made. 
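
The arithmetic behind these figures can be checked with a short sketch. It assumes only that, with no real difference, each sign is + or − with probability one-half; the function name is illustrative.

```python
def prob_all_signs_alike(n: int) -> float:
    """Chance that n fair, independent signs are all + or all -."""
    # Either all plus (1/2)^n or all minus (1/2)^n.
    return 2 * 0.5 ** n


for n in (4, 5, 6):
    # n = 4 and n = 5 exceed 5%; n = 6 is the first sample size below it.
    print(n, prob_all_signs_alike(n))
```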
As in most statistical work, more reliable results are obtained from 
a larger number of observations. One would not ordinarily use the sign 
test for samples as small as 10 or 15, except for rough or preliminary 
work. 

The question may be raised as to the minimum sample size necessary 
to detect a given difference in two materials. Suppose that in an indefi- 
nitely large number of observations 30% +’s and 70% —’s are to be 
expected and that we wish the sample to be large enough to detect this 
difference at the 1% level of significance. Although no sample, how- 
ever large, will make it absolutely certain that a significant difference 
will be found, the sample size can be chosen to make the probability 
of finding a significant result as near to certainty as is desired. In 
Table 2, this probability has been chosen as 95%; the minimum 
values of n (sample size) and the corresponding critical values of r 
to insure a decision 95% of the time are given for various actual
percentages $p_0$ and levels of significance $\alpha$.

The sign test merely measures the significance of departures from a 
50-50 distribution. If the signs are actually distributed 45-55, then 
the departure from 50-50 is not likely to be significant unless the
sample is quite large. Table 2 shows that if the signs are actually
distributed 45-55, then one must take samples of 1,297 pairs in order 
to get a significant departure from a 50-50 distribution at the 5% 
level of significance. The number 1,297 is selected to give the desired 
significance 95% of the time; that is, if a large number of samples of 
1,297 each were drawn from a 45-55 distribution, then 95% of those 
samples could be expected to indicate a significant departure (at the 
5% level) from a 50-50 distribution. 


TABLE 2 


MINIMUM VALUES OF n NECESSARY TO FIND SIGNIFICANT
DIFFERENCES 95% OF THE TIME FOR VARIOUS
GIVEN PROPORTIONS

                            n                    |               r
   p_0        α=1%     5%    10%    25%   |  α=1%     5%    10%    25%
.45 (.55)    1,777  1,297  1,080    780   |   833    612    512    373
.40 (.60)      442    327    267    193   |   198    145    119     87
.35 (.65)      193    143    118     86   |    78     59     49     37
.30 (.70)      106     79     67     47   |    39     30     26     19
.25 (.75)       66     49     42     32   |    22     17     15     12
.20 (.80)       44     35     28     21   |    13     11      9      7
.15 (.85)       32     23     18     14   |     8      6      5      4
.10 (.90)       24     17     13     11   |     5      4      3      3
.05 (.95)       15     12     11      6   |     2      2      2      1

The italicized values are approximate. The maximum error is about 5 for the value of n, and 2
for the value of r. The values of n and r for 5% were taken from MacStewart (reference 1) who gives
a table of values of n and r for a range of confidence coefficients (the above table uses only 95%) and a
single value α = 5%.


Of course, in practice one would not do any testing if he knew in 
advance the expected distribution of signs (that it was 45-55, for 
example). The practical significance of Table 2 is of the following 
nature: In comparing two materials one is interested in determining 
whether they are of about equal or of different value. Before the in- 
vestigation is begun, a decision must be made as to how different 
the materials must be in order to be classed as different. Expressed 
in another way, how large a difference may be tolerated in the state- 
ment that “the two materials are of about equal value?” This decision, 
together with Table 2, determines the sample size. If one is interested 
in detecting a difference so small that the signs may be distributed 
45-55, he must be prepared to take a very large sample. If, however, 
one is interested only in detecting larger differences, (for example, dif- 
ferences represented by a 70-30 distribution of signs), a smaller sample 
will suffice. 

In many investigations, the sample size can be left undetermined,
and only as much data accumulated as is needed to arrive at a decision.
In such cases, the sign test could be used in conjunction with methods 
of sequential analysis. These methods provide a desired amount of 
information with the minimum amount of sampling on the average. 
A complete exposition of the theory and practice of sequential analysis 
may be found in references 3 and 4. 


MODIFICATIONS OF THE SIGN TEST 


When the data are homogeneous (measurements are comparable 
between pairs of observations), the sign test can be used to answer 
questions of the following kind: 

1. Is material A better than B by P per cent? 

2. Is material A better than B by Q units? 

The first question would be tested by increasing the measurement on 
B by P per cent and comparing the results with the measurements on 
A. Thus, let 

$$(x_1, y_1),\ (x_2, y_2),\ (x_3, y_3),\ \text{etc.}$$

be pairs of measurements on A and B, and suppose one wished to test
the hypothesis that the measurements, x, on A were 5% higher than
the measurements, y, on B. The sign test would simply be applied to
the signs of the differences

$$x_1 - 1.05y_1,\ x_2 - 1.05y_2,\ x_3 - 1.05y_3,\ \text{etc.}$$

In the case of the second question the sign test would be applied to the
differences

$$x_1 - (y_1 + Q),\ x_2 - (y_2 + Q),\ x_3 - (y_3 + Q),\ \text{etc.}$$


In either case, if the resulting distribution of signs is not significantly 
different from 50-50, the data are not inconsistent with a positive 
answer to the question. Usually there will be a range of values of P 
(or Q) which will produce a non-significant distribution of signs. If 
one determines such a range, using the 5% level of significance for ex- 
ample, then that range will be a 95% confidence interval for P (or Q). 
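
A minimal sketch of how such a range for Q might be found in practice. It assumes no tied differences and uses the fact that a value of Q is not rejected when more than r of the differences $x_i - y_i$ lie on each side of it, so the range runs between two order statistics of the differences. The function names and the data in the usage line are hypothetical.

```python
from math import comb


def critical_r(n, alpha):
    """Largest r whose lower binomial tail at p = 1/2 is <= alpha/2, or None."""
    tail, best = 0, None
    for r in range(n + 1):
        tail += comb(n, r)
        if tail / 2 ** n <= alpha / 2:
            best = r
        else:
            break
    return best


def sign_interval_for_q(x, y, alpha=0.05):
    """Open interval of Q values giving a non-significant distribution of signs."""
    diffs = sorted(a - b for a, b in zip(x, y))
    n = len(diffs)
    r = critical_r(n, alpha)
    if r is None:
        raise ValueError("sample too small for this level of significance")
    # Q must leave at least r + 1 differences below it and r + 1 above it.
    return diffs[r], diffs[n - 1 - r]


# Hypothetical data: ten pairs whose differences are 1, 2, ..., 10.
print(sign_interval_for_q(list(range(1, 11)), [0] * 10))
```

Because the signs are discrete, the confidence attached to this interval is at least, rather than exactly, 1 − α.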

Even when the data are not homogeneous, it may be possible to 
frame questions of the above kind, or it may be possible to change the 
scales of measurement so that such questions would be meaningful. 


MATHEMATICAL APPENDIX 
A. Assumptions. Let observations on two materials or treatments A
and B be denoted by x and y, respectively. It is assumed that for any
pair of observations $(x_i, y_i)$ there is a probability $p$ $(0 < p < 1)$ that
$x_i > y_i$ $(i = 1, 2, \ldots, n)$; p is assumed to be unknown.¹ It is also
assumed that the n pairs of observations $(x_i, y_i)$ $(i = 1, 2, \ldots, n)$
are independent; i.e., the outcome (+ or −) for $(x_i, y_i)$ is independent
of the outcome for $(x_j, y_j)$ $(i \neq j)$.

B. The Observations. The purpose of obtaining observations $(x_i, y_i)$
is to make an inference regarding p. The observed quantity upon which
an inference is to be based is r, the number of +’s or −’s (whichever
occur in fewer numbers) obtained from n paired observations $(x_i, y_i)$.
On the basis of the assumption above it follows that the probability of
obtaining exactly r as the minimum number of +’s or −’s is:

$$
\begin{cases}
\dbinom{n}{r}\left[p^{r}(1-p)^{n-r} + p^{n-r}(1-p)^{r}\right], & r = 0, 1, 2, \ldots, \dfrac{n-1}{2};\ n \text{ odd} \\[2ex]
 & r = 0, 1, 2, \ldots, \dfrac{n-2}{2};\ n \text{ even} \\[2ex]
\dbinom{n}{r}\, p^{n/2}(1-p)^{n/2}, & r = \dfrac{n}{2};\ n \text{ even.}
\end{cases}
$$

C. The Inference. In the sign test the hypothesis being tested is
that $p = \tfrac{1}{2}$; in other words that the distributions of the differences
$x_i - y_i$ $(i = 1, 2, \ldots, n)$ have zero medians. For the more general
tests discussed in Section 5, the hypothesis is that the differences
$x_i - f(y_i)$ $(i = 1, 2, \ldots, n)$ have zero medians. The function f(y) may
be Py or Q+y (where P and Q are the constants mentioned in Section
5) or any other function appropriate for comparison with x in the
problem at hand.

The hypothesis that $p = \tfrac{1}{2}$ is tested by dividing the possible values of
r into two classes and accepting or rejecting the hypothesis according
as r falls in one or the other class. The classes are chosen so as to make
small (say $\le \alpha$) the chance of rejecting the hypothesis when it is
true and also to make small the chance of accepting the hypothesis
when it is untrue. It can be shown that in a certain sense, the best set
of rejection values for r is $0, 1, \ldots, R$, where R depends on $\alpha$ and n.
R can be determined by solving for R = maximum i in the inequality:

$$\sum_{j=0}^{i}\binom{n}{j}\left(\frac{1}{2}\right)^{n} = I_{1/2}(n-i,\ i+1) \le \tfrac{1}{2}\alpha$$

where $I_x(a, b)$ is the incomplete beta function. Table 1 was computed
in this way.
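
The inequality can be solved by direct summation, as in the sketch below. It sums the binomial terms rather than evaluating the incomplete beta function, and it compares the result with the large-sample approximation $\tfrac{1}{2}(n-1) - k\sqrt{n+1}$ quoted beneath Table 1. The function names are illustrative, and taking "nearest integer less than" to mean rounding down is an assumption.

```python
from math import comb, floor, sqrt


def critical_r(n, alpha):
    """Largest i with sum_{j<=i} C(n, j) / 2^n <= alpha / 2, or None if none exists."""
    tail, best = 0, None
    for i in range(n + 1):
        tail += comb(n, i)
        if tail / 2 ** n <= alpha / 2:
            best = i
        else:
            break
    return best


def approximate_r(n, k):
    """Large-sample approximation (n - 1)/2 - k*sqrt(n + 1), rounded down."""
    return floor(0.5 * (n - 1) - k * sqrt(n + 1))


print(critical_r(28, 0.05))        # 5% critical value for 28 pairs
print(critical_r(5, 0.05))         # None: five pairs can never reach the 5% level
print(critical_r(100, 0.05), approximate_r(100, 0.9800))
```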


1 An additional assumption is that the probability of $A_i = B_i$ is 0; thus the probability of $B_i > A_i$ is
$(1-p)$.

D. Sample Sizes. When the sample size is small the sign test is likely
to reject the hypothesis, $p = \tfrac{1}{2}$, only if p is near zero or one. If p is near,
but not equal to $\tfrac{1}{2}$, the test is likely to reject the hypothesis, $p = \tfrac{1}{2}$, only
when the sample is large.

The sample size required to reject the hypothesis $p = \tfrac{1}{2}$ at the $\alpha$ level
of significance, $100\lambda\%$ of the time, may be determined by finding the
largest i and smallest n which satisfy:

$$\sum_{j=0}^{i}\binom{n}{j}\left(\frac{1}{2}\right)^{n} \le \tfrac{1}{2}\alpha$$

and

$$\sum_{j=0}^{i}\binom{n}{j}\,p^{j}(1-p)^{n-j} \ge \lambda.$$

n and i are given in Table 2 for various values of p and $\alpha$; $\lambda$ was taken
to be .95 in all cases. The tabular values for 1−p are the same as those
for p because of the symmetry of the binomial distribution.
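
The search described above can be sketched as a direct loop over n. This is an illustration under stated assumptions, not the computation actually used for Table 2: p is taken to be less than one-half, the small probability of rejection in the opposite tail is ignored, and the function names and the `n_max` cutoff are hypothetical.

```python
from math import comb


def lower_tail(n, i, p):
    """P(at most i successes in n trials with success probability p)."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(i + 1))


def min_sample_size(p, alpha, lam=0.95, n_max=500):
    """Smallest n, with its critical value i, that rejects p = 1/2 with chance >= lam."""
    for n in range(1, n_max + 1):
        # Largest i whose tail under p = 1/2 does not exceed alpha / 2.
        i, count = None, 0
        for j in range(n + 1):
            count += comb(n, j)
            if count / 2 ** n > alpha / 2:
                break
            i = j
        if i is not None and lower_tail(n, i, p) >= lam:
            return n, i
    return None


print(min_sample_size(0.05, 0.25))   # compare with the .05 (.95) row of Table 2
print(min_sample_size(0.10, 0.25))   # compare with the .10 (.90) row of Table 2
```

For proportions close to one-half the required n runs into the hundreds or thousands, so `n_max` would have to be raised and the loop becomes slow; a normal approximation is the more practical route there.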

E. Efficiency of the Sign Test. Let $z = x - y$. Assume z is normally
distributed with mean $a$ and variance $\sigma^2$. The probability of obtaining
a + on a particular $z_i$ is:

$$p = \frac{1}{\sqrt{2\pi}}\int_{-a/\sigma}^{\infty} e^{-u^{2}/2}\,du.$$

An estimate of p involving only the signs of $z_i$ $(i = 1, 2, \ldots, n)$ yields
an estimate of $(a/\sigma)$. Cochran (reference 2) has shown that in large
samples the variance of this estimate of $(a/\sigma)$ is $2\pi pq\,e^{c^{2}}/n$, where $q = 1 - p$. We shall
denote $a/\sigma$ by c.

The efficiency of an estimate based on n independent observations
is defined as the limit (as $n \to \infty$) of the ratio of the variance of an
efficient estimate to that of the given estimate. An efficient estimate of
c is:


where t is Student’s t and $\bar{z} = \sum z_i/n$.
The variance of this estimate is 1/(n−2); thus the efficiency, E,
of the sign test is $e^{-c^{2}}/(2\pi pq)$. If c = 0, then $p = \tfrac{1}{2}$ and the efficiency is
$2/\pi = 63.7\%$.
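
The large-sample efficiency formula is easy to evaluate numerically. The sketch below assumes the normal model of this section, with $p$ obtained from the standard normal distribution function at $c = a/\sigma$; the function name is illustrative.

```python
from math import exp, pi
from statistics import NormalDist


def sign_test_efficiency(c):
    """Large-sample efficiency e^(-c^2) / (2*pi*p*q) with p = Phi(c), q = 1 - p."""
    p = NormalDist().cdf(c)
    q = 1 - p
    return exp(-c * c) / (2 * pi * p * q)


for c in (0.0, 0.5, 1.0):
    # The value at c = 0 is 2/pi; it falls off as |c| grows.
    print(c, sign_test_efficiency(c))
```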
The preceding discussion pertains to large values of n; for small
values of n, the efficiency is a little better than 63.7%. Computations
were made for several smaller values of n, namely, for n = 18, 30, 44
pairs of observations at the 10% level of significance. It was found
that the sign test using 18 pairs of observations is approximately
equivalent to the t-test using 12 pairs of observations; for 30 pairs
the equivalent t-test requires between 20 and 21 pairs; and for 44
pairs the equivalent t-test requires between 28 and 29 pairs. Cochran
shows that the efficiency of r/n for estimating c decreases as |c|
increases.


TESTS OF INCREASED SEVERITY


C. WEST CHURCHMAN
University of Pennsylvania 
BENJAMIN EPSTEIN 
Coal Research Laboratory, Carnegie Institute of Technology 


Certain manufactured items must be produced of such 
quality that only a very small fraction will fail when used. In 
problems of this sort standard sampling plans are impractical 
because of the excessively large sample sizes required. In this 
paper we consider the statistical questions involved in de- 
veloping an efficient sampling plan for such problems, and 
recommend a procedure which has been found very useful in 
biological assay and in the testing of explosives. There is good 
reason to believe that the same statistical techniques can also 
be used in many other fields in which one is interested in 
studying the response of objects tested as a function of the 
strength of the stimuli to which they are subjected. 


I. INTRODUCTION 


THERE are numerous manufactured items which must be so
constructed that only a very small fraction of these items will fail to
perform under certain operating conditions. For example: (a) a 
munitions plant is required to produce primers sufficiently sensitive 
to ensure that very few and preferably no primers will fail to explode 
under a specified blow; (b) an aircraft manufacturer may want to be 
certain that the strength of welds meets some specified requirement 
(e.g., one might require that only 1 in 1,000 welds should fail to meet 
shear stress of X pounds); (c) a rope manufacturer may wish to make 
a product which will be able to withstand a steady load of X pounds 
applied for Y minutes; (d) the manufacturer of electrical equipment 
may desire to make a product capable of withstanding some specified 
voltage with virtually no breakdown of the electrical insulation; (e) 
a drug manufacturer may wish to test the effectiveness of a drug in 
destroying a certain organism. 

The data arising from all problems of this type may be termed 
“Sensitivity Data.” This is a general term applied to that type of ex- 
perimental data for which the act of taking a measurement on the 
object tested either destroys the object or so changes the properties 
of the object that it can no longer be used in further testing. 

We shall call the “stimulus” that condition to which we subject the 
samples, and we shall call the stimulus at which a given object just
fails the “critical stimulus”; that is, for a stimulus of less intensity
than the critical stimulus, the object will not fail, and for a stimulus of 
greater intensity than the critical stimulus, the object will fail. These 
terms are illustrated in the following examples: 


Objects Tested    Type of Stimulus            Critical Stimulus

Primer Caps       Blow of a certain energy    Energy of blow just necessary to explode
Welds             Shear stress (in pounds)    Stress just necessary to shear
Rope              Load (in pounds)            Load just necessary to break
Insulation        Voltage                     Voltage just necessary to give breakdown
Organisms         Dosage of drug              Amount just necessary to kill


It is the purpose of this paper to give the general mathematical and 
statistical attack recently developed for problems of this nature and 
to compare this approach with other current methods. Sensitivity 
problems may be divided into two broad categories: 

I. The applied blow or stimulus or particular test condition will re- 
sult either in failure or non-failure of the specimen tested. A numerical 
value cannot be attached to the result of any test. The specimen either 
fails or passes, explodes or does not explode, breaks or does not break. 

II. The stimulus is so applied that each object tested has associated 
with it a precise stimulus number (i.e., a value of the stimulus for 
which the object responds critically). For example, to any weld $X_i$
there can be associated a shear stress $Y_i$.

In this paper we shall restrict ourselves to problems belonging to 
category I because the data associated with problems in category II 
may be readily analyzed by the use of known techniques in the
elementary theory of frequency distributions and curve fitting.¹

It will be our basic aim in this paper to show how one can deter- 
mine the distribution of stimuli for which the objects will respond 
critically, even though it is impossible to measure the exact value of 
the stimulus at which a given specimen will respond critically. For 
example, the physiologist is interested in the value of the dosage for 
which 50% of the specimens tested will survive. Yet, in practice, one 
cannot arrive at the precise value of the critical stimulus for each 
specimen by a method of successive approximations, since once an 
animal has been injected with any concentration of the drug at all, 
it can no longer be used for further tests.²

1 As an example of a problem belonging to category II see an article by C. S. Barrett, “Quality
Control with Sampling Inspection,” Mechanical Engineering, 64, 361-364 (1942).
2 See Statistical Note No. 1.

The particular use of sensitivity data which we will consider for the
moment will be the problem whether or not a lot meets the requirement
that only a small fraction (.001 or .0001 for example) fails under cer- 
tain operating conditions; the inference is to be made from a sample. 
A number of methods can be used in testing such a hypothesis. Prefer- 
ence for one method over another will be based on the principle that 
we desire the maximum of reliable information from the minimum of 
data, plus the amount of prior information we can bring to bear. 

At present there are four methods available for analyzing sensitivity 
data belonging to category I. These are: 

I(a). Tests conducted only at the stimulus required by the specifi- 
cations. The submitted lot may be judged on the basis of the behavior 
of a sample at the specification stimulus;

I(b). Tests conducted at some stimulus which subjects the sample 
to more stringent operating conditions (e.g., to heavier blows, greater 
stresses, smaller or larger doses of a drug, etc.); 

I(c). Tests conducted at two and sometimes three stimuli, all of 
them being generally different from the specification stimulus; 

I(d). Complete run-down tests, i.e., tests conducted over a dis- 
crete series of stimuli from the stimulus resulting in no failures to the 
stimulus yielding 100% failures. 


II. TESTS BELONGING TO CATEGORY I(a) 


Tests of this type are usually inefficient, since one cannot get good 
assurance that a submitted lot will have the high quality required by 
the specification, even if all members in the small sample pass the 
test. In order to get such assurance huge samples are required. 

To bring these points into bold relief, let us consider the following: 
what can be said about the quality of a lot, if a random sample of 20 
or even 100 specimens taken from the lot contains no defectives? 

The answer to this question is given by the following table³ which
gives the probability of accepting material of a given quality on the
basis of samples of 20 or 100 if the inspection plan consists of accepting
if no defects occur, and rejecting otherwise. (Of course, if retests are
permitted the probability of acceptance will increase.)

For example, the end values in the second line of Table 1 may be
read as follows: if acceptance is based solely on no defects in 20, then
the chances are 1 in 100 of accepting material 21% defective and 99
in 100 of accepting material .05% defective. Thus, a sample of 20, or
even 100, is virtually useless if one wishes to accept very few lots of a
quality level worse than .1% defective, say.

3 See Statistical Note No. 2.

TABLE 1

QUALITY OF LOTS THAT MAY BE ACCEPTED CORRESPONDING TO
VARIOUS PROBABILITIES IF NO DEFECTIVES ARE
ALLOWED IN THE SAMPLE

Sample                     Probability of Acceptance
Size        .01      .05      .1       .5       .9       .95      .99
 20         21%      14%      11%      3.5%     .5%      .25%     .05%
100         4.5%     2.9%     2.3%     .7%      .1%      .05%     .01%

The following general remarks may be made about this kind of test:
(1) its advantage is that it makes no presupposition about the lot and
the method of inspection except random (representative) sampling;
(2) its disadvantages are that for even a moderate degree of assurance,
large samples are needed, and further, without additional information,
one cannot extrapolate from the inspection test to all the many
conditions to be expected in service.
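
The entries of Table 1 can be reproduced with a short sketch. It assumes the lot is large enough that each sampled specimen is defective independently with the lot's fraction defective, so that the chance of no defectives in a sample of n is $(1-p)^n$; the function names are illustrative.

```python
def prob_accept(fraction_defective, n):
    """Chance that a random sample of n contains no defectives."""
    return (1 - fraction_defective) ** n


def quality_accepted(prob, n):
    """Fraction defective that is accepted with the given probability."""
    return 1 - prob ** (1 / n)


for n in (20, 100):
    row = [quality_accepted(pr, n) for pr in (0.01, 0.05, 0.1, 0.5, 0.9, 0.95, 0.99)]
    # Print each entry as a percentage, for comparison with Table 1.
    print(n, [f"{100 * q:.2f}%" for q in row])
```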


III. TESTS BELONGING TO CATEGORY I(b) 


The disadvantages of tests of type I(a) lead directly to tests falling 
into category I(b). Why should one carry out the tests at that stimulus 
level which is to yield less than .1% defective? Why not carry out the 
test under test conditions which are so severe that one may expect a 
larger percentage of the items (20% or 50%, for example) to fail the 
test? This idea is certainly not new. Any accelerated laboratory test 
accomplishes a similar purpose; test conditions are so altered that a 
lot may be graded more rapidly and with less work. 

In general, the stimulus which should give approximately 50% failures
in the lot is determined from a series of initial tests conducted on
samples drawn from a lot which is chosen as standard.⁴ A number of
fairly large samples (in case of primer lots, 300 to 500; in case of drugs,
perhaps 100 animals) are taken at random from the lot. Each of these
large samples is broken up into a number of smaller sub-samples, each
sub-sample being tested at a different stimulus level. From the
observed set of $p_i$, the fraction failing to survive at the stimulus $x_i$, it is
possible to obtain an estimate of the distribution of critical stimuli.
In particular that stimulus for which one-half of the total lot fails to
survive can be estimated. For convenience this is the stimulus used in
further tests. The principle involved in obtaining a calibration
distribution is an important one. It is the forerunner of the modern method of
attack on sensitivity problems. When this approach is fully developed,
it leads directly to the complete run-down test in category I(d).

4 There is a certain degree of arbitrariness attached to the selection of a standard lot. Presumably
the lot designated as standard is chosen because it gives satisfactory performance and is known to
meet the requirements of the specification.

One should note carefully that a fundamental distribution assump- 
tion is made when lot quality is judged purely on the basis of the results 
obtained at the stimulus giving 50% failures in the reference lot. In- 
creased severity tests carried out at a single stimulus, giving 50% fail- 
ures in the reference lot, are sensitive indicators of significant differ- 
ences among lots at the specification stimulus, only if one makes a 
certain assumption about the distribution of critical stimuli; for ex- 
ample, that the “critical” stimuli are normally distributed with a known 
standard deviation which is the same for all lots. If an assumption of 
this kind is justified, then a sensitive test for significance can be made 
even with small samples. 

In order to get some idea of the efficiency of an increased severity 
test, consider the following example: (a) we shall call acceptable any 
lot which has .1% failures or fewer at a stimulus giving .1% failures in 
a standard lot; (b) we shall call unacceptable any lot which has 1% 
failures or more at the stimulus just mentioned; (c) we shall call 
marginal those lots which give between .1% and 1% failures; (d) the 
risk of rejecting a lot having .1% failures shall be .05; (e) the risk of 
accepting a lot having 1% failures shall be .01.⁵ Then, it can be shown
that even under the best possible sampling methods we shall be required
on the average to take a sample of 628 specimens for lots giving
.1% failures and a sample of 206 specimens for lots giving 1% failures
in order to reach a decision.⁶

Now, suppose that we were to translate the problem involving the 
discrimination between lots which give .1% and 1% failures respec- 
tively at the stimulus giving .1% failures in the standard lot into one 
involving a comparison carried out at a stimulus giving 50% failures 
in the standard lot, then what sample sizes should be taken in order to 
arrive at a decision with the risks stated above? 

5 In statistical terminology, the producer's risk $\alpha$ = .05, and the consumer's risk $\beta$ = .01.
6 For details of such calculations see “Sequential Analysis of Statistical Data, Applications,”
AMP Report 30.2R, Section 2, revised.

One can easily construct an increased severity analogue to the problem
just stated. The fundamental assumption that the critical stimuli
are normally distributed with the same standard deviation means that
the critical stimulus causing 50% failures in a lot having .1% failures
at the specification stimulus will cause 78% failures in a lot having 1%
failures at the specification stimulus.⁷ Graphically the situation may
be visualized as shown in Fig. 1, where Curve A is Distribution of
Critical Stimuli in lots having .1% failures at the specification stimulus
and Curve B is Distribution of Critical Stimuli in lots having 1%
failures at the specification stimulus. Thus, a comparison between lots
having .1% and 1% failures at the specification stimulus has been
converted into one between lots having 50% and 78% failures, respectively,
at the increased severity stimulus with producer’s and consumer’s
risks equal to .05 and .01, respectively.
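
The conversion from .1% and 1% to 50% and 78% can be checked with a few lines. The sketch assumes, as the text does, normally distributed critical stimuli with a common standard deviation; the function name is illustrative.

```python
from statistics import NormalDist


def failures_at_shifted_stimulus(frac_lot_a, frac_lot_b):
    """Failure fraction in lot B at the stimulus giving 50% failures in lot A.

    frac_lot_a and frac_lot_b are the failure fractions of the two lots
    at the specification stimulus.
    """
    std = NormalDist()
    # Distance, in standard deviations, from the specification stimulus
    # up to each lot's 50% point.
    z_a = -std.inv_cdf(frac_lot_a)
    z_b = -std.inv_cdf(frac_lot_b)
    return std.cdf(z_a - z_b)


print(failures_at_shifted_stimulus(0.001, 0.01))   # close to .78
```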

It can be shown that we shall be required to take on the average a 
sample of 22 specimens for lots having 50% failures at the increased 
severity level and a sample of 18 specimens for lots having 78% failures 
at this level in order to reach a decision. This means that reliable in- 
ferences may be drawn with greatly reduced samples if the tests are 
conducted at the increased severity level in the manner described. 
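
The average sample sizes quoted above can be approximated with a short sketch. It assumes the figures come from Wald's approximate formulas for the average sample number of a sequential probability ratio test on a fraction defective; the paper itself only refers to AMP Report 30.2R for details, so that attribution and the function name are assumptions.

```python
from math import log


def wald_average_sample_numbers(p0, p1, alpha, beta):
    """Approximate average sample sizes when the true fraction is p0 and when it is p1."""
    log_a = log((1 - beta) / alpha)      # boundary for rejecting the lot
    log_b = log(beta / (1 - alpha))      # boundary for accepting the lot

    def mean_step(p):
        # Expected log likelihood ratio contributed by one specimen.
        return p * log(p1 / p0) + (1 - p) * log((1 - p1) / (1 - p0))

    asn_p0 = ((1 - alpha) * log_b + alpha * log_a) / mean_step(p0)
    asn_p1 = (beta * log_b + (1 - beta) * log_a) / mean_step(p1)
    return asn_p0, asn_p1


print(wald_average_sample_numbers(0.001, 0.01, 0.05, 0.01))  # specification stimulus
print(wald_average_sample_numbers(0.50, 0.78, 0.05, 0.01))   # increased severity stimulus
```

Under these assumptions the first call gives roughly 620 and 207, near the 628 and 206 quoted earlier, and the second gives roughly 22 and 18.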

The following general remarks may be made about tests in I(b): 
(1) the advantage is that it is possible to predict failures under service 
conditions with a relatively small sample; (2) the disadvantages are 
that very strong presuppositions must be made which can only be 
checked after considerable experimentation; in particular, not only 
must the form of the distribution function be known, but also the 
standard deviation. The latter is usually very difficult to estimate with 
any satisfactory degree of precision for this kind of test since even small 
errors in estimate will result in large errors in prediction. 


IV. TESTS BELONGING TO CATEGORY I(c) 


In practice it is found that not only does the value of the stimulus 
giving 50% failures vary from lot to lot, but so may the standard de- 
viation of the critical stimuli. If this condition prevails, then an in- 
creased severity test conducted at only one point will give misleading 
and erroneous results. Such tests are inadequate for detecting differ- 
ences among means due to the fact that the standard deviation may 
differ significantly from lot to lot. This will mean that the distribution 
curves associated with each lot will no longer be capable of transforma- 
tion into one another by translation alone. 

7 As a consequence of the fundamental distribution assumptions, reference to the table of areas
under the normal curve will show that if lot A has .1% failures at the specification stimulus and lot B
has 1% failures at this stimulus, then the stimuli giving 50% failures in lots A and B will be $3.09\sigma$
and $2.33\sigma$ away, respectively, from the specification stimulus. This means that the stimulus giving 50%
failures in lot A is separated by $.76\sigma$ from the stimulus giving 50% failures in lot B. Therefore, the
stimulus giving 50% failures in lot A will give 78% failures in lot B.

Recognition of the fact that the standard deviation of the critical
stimuli plays a fundamental role in predicting failure under various
conditions has led to tests belonging to category I(c), i.e., increased
severity tests employing two and possibly three levels. In the case
where the test is conducted at two stimuli $x_1$ and $x_2$, it is customary to
choose $x_1$ as that stimulus for which about 20% of the objects tested
may fail to pass and $x_2$ as that stimulus for which 80% of the objects
tested may fail to pass.⁸ $\pi_i$, the true fraction passing at stimulus $x_i$,
may be represented as the integral of a normal distribution from $x_i$
to $\infty$, i.e., as:

$$(1)\qquad \pi_i = \int_{x_i}^{\infty} \frac{1}{\sqrt{2\pi\sigma^{2}}}\; e^{-(x-\mu)^{2}/2\sigma^{2}}\,dx$$


The problem is to determine estimates, $\bar{x}$ and $s$, of the true mean, $\mu$,
and true standard deviation, $\sigma$, from the experimentally determined
values of $\pi_1$ and $\pi_2$ (denoted by $p_1$ and $p_2$) at the stimuli $x_1$ and $x_2$.
Schematically, we have:
$p_i$ = observed fraction passing at $x_i$
$\pi_i$ = true fraction passing at $x_i$
$\bar{x}$ = sample estimate of the true mean critical stimulus
$\mu$ = true mean critical stimulus
$s$ = sample estimate of the true standard deviation
$\sigma$ = true standard deviation
This can readily be accomplished for a normal distribution by treating
the experimentally determined $p_1$ and $p_2$ as the true $\pi_1$ and $\pi_2$ and
associating with each $p_i$ a $t_i$, where $t_i = (x_i - \bar{x})/s$ and where values of $t_i$
vs. $p_i$ are to be found in any standard table of normal areas (e.g., see
H. C. Carver, “Statistical Tables,” Edwards Bros., 1941). In particular,
we obtain the simultaneous linear equations:

$$(2)\qquad t_1 = \frac{x_1 - \bar{x}}{s}, \qquad t_2 = \frac{x_2 - \bar{x}}{s}$$

which may easily be solved for $\bar{x}$ and $s$. The solutions are:

$$(3)\qquad \bar{x} = x_1 - \frac{d\,t_1}{t_2 - t_1} \quad\text{and}\quad s = \frac{d}{t_2 - t_1}, \quad\text{where } d = x_2 - x_1.$$

8 Actually, the best estimates are made if the percentages are 6% and 94%, respectively, in the
sense that the sampling variance of the standard deviation is thereby minimized (see J. H. Gaddum,
“Methods of Biological Assay,” etc., Medical Research Council Reports on Biological Standards, Spec. Rep.
Ser. No. 183); but for most procedures it is risky to try for these values since one is apt to obtain 0% or
100% values, and the method cannot then be applied. Consequently, whenever possible, one should
aim at the compromise of 20% and 80%.

This method for determining estimates of the parameters of the normal
distribution is a special case of the “probit” method. It also should be
noted that as a direct consequence of carrying out the test at two levels
we can compare two lots not only as to their means, but also as to their
standard deviation or more generally as to $\mu + k\sigma$.

There may be advantages in using a third stimulus midway between
the other two stimuli. Such a procedure permits at least a crude check
on the assumption of normality by requiring that the middle stimulus
produce a $p_3$ which in turn will give a $(t_3, x_3)$ which is “close” to the
line joining the points $(t_1, x_1)$ and $(t_2, x_2)$. To determine how closely the
$(t_i, x_i)$ should cluster about the line, one would have to run a
conventional regression analysis with weighted coefficients.

In this connection, it should be noted that if a sensitivity test is
conducted at more than two levels, then the simple equations (2) no
longer suffice to determine $\bar{x}$ and $s$. In the more general case where one
has more than two number pairs $(p_i, x_i)$, one determines $\bar{x}$ and $s$ from
a least squares linear fit to the associated number pairs $(t_i, x_i)$. This
method is cumbersome, however, since the $t_i$ do not have equal
standard errors.⁹

9 For a complete account of the application of the method to certain biological problems, see
C. I. Bliss, “The Calculation of the Dosage Mortality Curve,” Annals of Applied Biology, 22, 134-167.
It is to be noted that in many cases where the critical stimuli are not normally distributed the
assumption of normality may be satisfied, if the stimuli are subjected to certain familiar
transformations, e.g., the logarithmic. Also, the complexities involved in Bliss’s method may be
avoided in part by employing distribution functions that are practically identical with the normal,
or appear to fit the data better; e.g., the so-called “growth curve”: see E. B. Wilson and
S. Worcester, “The Determination of L.D. 50 and its Sampling Error in Bio-Assay,” Proceedings
National Academy of Science, 29, 1943.

The procedure for the two-stimuli test may be summarized as follows:
(1) test a given size sample at two stimuli $x_1$ and $x_2$, preferably,
though not necessarily, where the respective percentages are about 20
and 80; e.g., suppose $x_1 = 3$ foot-pounds, $x_2 = 5$ foot-pounds in a test
where the stimulus is the energy of blow; (2) call the fraction unaffected
at $x_1$, $p_1$, and the fraction unaffected at $x_2$, $p_2$; e.g., say $p_1$ is .84 and $p_2$
is .10; (3) in tables of areas of the normal curve (see any standard
statistics text), find that value of $t$ for which the area is $p_1$ (call it $t_1$), and
that value of $t$ for which the area is $p_2$ (call it $t_2$); e.g., for $p_1 = .84$,
$t_1 = -1.00$, and for $p_2 = .10$, $t_2 = +1.28$; (4) solve for $\bar{x}$ and $s$ by the
equations (3) given above; e.g., in the example cited:

$$\bar{x} = 3 + \frac{2(1)}{2.28} = 3.88, \qquad s = \frac{2}{2.28} = .88;$$

(5) then, as a result of these calculations, one can predict that about
nine failures in 400 will occur at $\bar{x}+2s$, about four failures in three
thousand will occur at $\bar{x}+3s$, about one failure in 100,000 will occur
at $\bar{x}+4s$, and about three failures in 10,000,000 will occur at $\bar{x}+5s$.¹⁰
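The arithmetic of steps (3) to (5) can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes that the solution in step (4) is the straight line through the two points $(t, x)$, which reproduces the worked figures, and the function names are hypothetical. Because it uses unrounded normal deviates instead of table values, its results agree with 3.88 and .88 only to about two decimals.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def two_stimulus_estimates(x1, p1, x2, p2):
    """Estimate the mean and s.d. of the critical stimuli from two test levels.

    p1 and p2 are the fractions UNAFFECTED at stimuli x1 < x2. Following the
    text's convention, t is the normal deviate whose upper-tail area equals p.
    Assumption: the estimates come from the line through (t1, x1) and (t2, x2).
    """
    t1 = _STD_NORMAL.inv_cdf(1.0 - p1)
    t2 = _STD_NORMAL.inv_cdf(1.0 - p2)
    s = (x2 - x1) / (t2 - t1)   # slope of the line through the two points
    x_bar = x1 - t1 * s         # value of the line at t = 0 (the 50% point)
    return x_bar, s


def upper_tail_fraction(k):
    """Normal fraction lying more than k standard deviations above the mean."""
    return 1.0 - _STD_NORMAL.cdf(k)


x_bar, s = two_stimulus_estimates(3.0, 0.84, 5.0, 0.10)
print(round(x_bar, 2), round(s, 2))
for k in (2, 3, 4, 5):
    print(k, upper_tail_fraction(k))
```

The loop prints the normal tail fractions from which predictions of the kind listed in step (5) are read.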

The following general remarks may be made concerning tests in 
I(c): (1) the advantage is that it is possible to predict failure under 
service conditions with relatively small samples, and with no prior 
knowledge about the true mean or standard deviation of the distribu- 
tion; (2) the disadvantages are that some distribution assumption of 
critical responses must be made which must be very exact if predictions 
far out on the tails are to be valid, and some knowledge of the 20% 
and 80% failure points must be had; (complete failure or complete 
success is worthless information for this test). 


V. TESTS BELONGING TO CATEGORY I(d) 


In the previous paragraph we have given, in essence, the treatment 
of sensitivity data as developed by Bliss. This has opened the way 
directly to a discussion of problems belonging to category I(d) (i.e., 
complete run-down tests where the stimuli used range from the stimu- 
lus for which none of the objects is affected to the stimulus for which all 
of the objects tested are affected). Bliss’s method is applicable to such 
problems. It is, however, the purpose of the remaining portion of the 
paper to give an alternative method of analysis which we feel is to be 
preferred primarily because of its simplicity. The method was first
proposed by Spearman in the treatment of problems arising in
psychometrics and was recently generalized by the authors.¹¹ We will assume
throughout this treatment that the critical stimuli (or some
simple function of the critical stimuli, such as the logarithm) are
normally distributed. This sort of presupposition seems to be in accordance
with the findings of experimenters in the field of sensitivity data and is
certainly needed if there is to be any meaning to the estimation of low
percentage points.¹⁰ However, it can be shown that the normality of
the universe is not a necessary condition in order that the method yield
unbiased estimates of $\mu$ and $\sigma$.

10 The estimate of a low percentage point (.1% or .00003%) is based on the presupposition that
the universe under study is normal or nearly normal “in the tails” of the distribution. This could only 
be “tested” empirically by taking a very large sample at extreme points; and even if one sample con- 
firmed the normality hypothesis, we would not have any great confidence in subsequent samples. The 
alternatives are to give up any attempt to estimate a low percentage point, or else to make certain 
presuppositions as a basis for inquiry. The latter alternative is not only sound methodology, but is 
characteristic of all scientific inquiry; for fuller explanation, the reader is referred to C. W. Church- 
man, “Probability Theory,” Phil. of Science, 12, 147-173 (1945). 

11 C. Spearman, “The Method ‘Right and Wrong Cases’ (constant stimuli) without Gauss’ Formula,”
British Journal of Psychology, 2, 1908, 227-242 and B. Epstein and C. W. Churchman, “On
the Statistics of Sensitivity Data,” Annals of Mathematical Statistics, 15, 90-96 (1944).

In brief, the attack on problems in category I(d) using Spearman’s
method is based on the key assumption that if $p_i$ is the observed fraction
unaffected at stimulus $x_i$ and $p_{i+1}$ the observed fraction unaffected
at stimulus $x_{i+1}$, then $p_i - p_{i+1}$ is an estimate of the fraction just
affected (i.e., the fraction of those that have “critical” responses) at
about $\tfrac{1}{2}(x_i + x_{i+1})$. If we assume initially that the $x_i$ are equally spaced
and that $p_1 = 1.0$ and $p_n = 0$, then the set of data on critical responses
may be transformed into data where the $x_i$’s are integers and the
intervals are unity. In problems in the detonation of solid explosives it is
found convenient to have the strength of blows increase linearly. In
biological problems it is customary to choose the dosages in such a way
that their logarithms are equally spaced. To take the $x_i$ (or $\log x_i$) as
equally spaced is convenient in practice, and this furthermore leads to
a great simplification in the computations. However, it is by no means
essential that this condition be true. Formulae can easily be derived
for $\bar{x}$, $s$, the higher moments, and the sampling errors, even if the $x_i$
are not evenly spaced.

In the second reference given in footnote (11), one will find a complete
method for determining $\bar{x}$, $s$, the higher moments, and their
sampling errors in terms of the $\pi_i$, the true fraction unaffected. In
practice, of course, one rarely knows the $\pi_i$ but does know the $p_i$, the
observed values of the fractions unaffected at stimulus $x_i$. It is possible
to give unbiased estimates for $\bar{x}$, $s$ and the sampling variances of these
unbiased estimates in terms of the $p_i$, but the proofs will not be given in
this paper. An outline of the key formulae is given in Statistical Note 
No. 3. 

The general remarks about this test are: (1) the advantage is that 
with a relatively small sample size, and simple computations, one is 
able to predict failures over a wide range of applications without 
presupposing the parameters (mean, standard deviation, etc.) of the 
distribution of critical responses; (2) the disadvantage is that for pre- 
diction purposes, the form of the distribution function of critical re- 
sponses must be presupposed, especially for predictions of very low 
percentage points. 

By making the transition from category I(a) to category I(d), a 
step of great importance has been taken. We have been able to increase 
enormously the efficiency and precision of the inferences that can be 
drawn from small samples. This is chiefly due to the fact that we are 
now able to treat the sensitivity problem in such a way that an at- 
tributes problem has been translated into one involving a continuous 
distribution. This means that even though we cannot measure the
critical stimuli for any of the objects under test, we can still associate
with any lot actual numbers, namely, $\bar{x}$, the estimate of stimulus for
which 50% of the objects will respond, $s$, the estimate of the standard
deviation of the critical stimuli, and finally by proper choice of $k$,
determine the stimulus for which any preassigned fraction of objects
will respond or fail to respond. Knowledge of $\bar{x}$, $s$, $\bar{x}+ks$, and their
errors will prove very useful in comparing the qualities of two or more
lots. Furthermore, control charts on $\bar{x}$, $s$, and $\bar{x}+ks$ when applied to
a succession of lots coming from a common source are very informative 
in detecting significant shifts in sensitivity level or the lack of control 
in individual lots. 

It will be noted that the choice of the four tests discussed here de- 
pends upon two things: the purpose of the test and the amount one is 
willing to presuppose. If the purpose of the test is to predict over a 
wide range of possible applications, tests under I(a) are the least satis- 
factory. It is to be emphasized that presuppositions must be made in 
every test procedure and the strength of the presuppositions must be 
weighed against cost and the confidence based on previous experience. 


VI. OPERATING CHARACTERISTICS OF 
TESTS OF INCREASED SEVERITY 


The discussion contained in the previous sections can be summarized 
conveniently by operating characteristic curves (OC curves). These 
curves really tell the whole story behind an inspection plan or an experi- 
mental design for they give the probabilities of acceptance (or rejec- 
tion) for any given “true” state. For the method of calculation of the 
OC curves discussed in this section see Statistical Note 4. 

As a simple example under I(a), assume that the inspection plan 
demands acceptance only if no defects occur in a sample of 20 (i.e., 
reject if one or more defects occur in the sample.) The resulting operat- 
ing characteristic curve is shown as curve A in Fig. 2; curve B repre- 
sents an identical plan except that 100 are tested instead of 20. It will 
be noted that curve B is moved to the left of curve A, which means 
that the chance of accepting bad material is always less under this plan 
(as is to be expected, since the requirement that no defect occur in 100 
is much harder to meet than the requirement that no defect occur in 
20). It will also be noted that curve B rises more sharply than curve 
A; a steep slope in the OC curve is always a desirable property for it 
means that the risks of rejecting good lots, and the risks of accepting 
bad lots are both less (provided the curve is properly “located”). Thus, 
the real advantage of an increased sample is to be found in the sharper
slope of the OC curve, and if this slope can be made sharper by increased
severity rather than increased sampling, a great advantage is
gained thereby. 

The gain in an increased severity test is shown in Fig. 2. Plan C consists
of testing, in accordance with I(b), at a stimulus ($X_0$) where about
50% of the standard product fails. It is assumed that the standard 
deviation of critical stimuli is unity and is constant for all lots; it is 
further assumed that the critical stimuli are normally distributed. 
The standard lot in this case was assumed to be one which would 
give only .05% failures at the specification stimulus. It will be noted 
that despite the fact that Plan C employs the same sample size as 
Plan B, the slope of the OC curve is much sharper for C. The risk of re- 
jecting good material is less (.023 vs. .05), and the risk of accepting bad 
material is greatly reduced (e.g., if a lot has 1% defectives, Plan C 
would practically never accept it, whereas Plan B would accept it 
about 40% of the time). 

In a large number of instances, the manufacturer is anxious that no 
defectives at all be found in the lot; of course, no sampling plan could 
ever give absolute assurance of this quality, but it may be extremely 
desirable to have a very small risk of accepting material with, say, 
.001% defectives or more. Such items are usually “critical” in some 
sense; that is, if the item is defective, complete failure of a rather costly 
manufactured piece will occur. (An excellent example is the primer or 
detonator in ammunition because these usually initiate the propellants 
and, if they fail, the entire round may be wasted.) In these cases the 
cost of the item is usually small (though not necessarily), and the manu- 
facturer is willing to increase the producer’s risk (risk of rejecting good 
material) in order to decrease the consumer’s risk to an absolute 
minimum. 

In such cases, provided we have no accurate knowledge of the variation
of the standard deviation, the schemes discussed in categories
I(c) and I(d) are practically necessary. Suppose we want at least a
50-50 assurance that the lot contains no more than three defects in
10 million (.00003%). Plan D represented in Fig. 3 consists of a run-down
test in accordance with I(d), or of a two-stimulus test in accordance
with I(c). In order to make comparisons we assume the errors of
estimate of the mean critical stimulus and of the standard deviation
to be about $.1\sigma$; this is a reasonable assumption for the error of these
statistics in tests run at intervals of about one standard deviation,
fifty samples at a stimulus, or in most two-stimulus tests where 100
to 200 are tested at each stimulus. The average number tested in either
case ranges from 200 to 400. We assume the distribution to be normal
and require that the quantity $\bar{x}+5s$ be less than the specification
stimulus.¹⁰ Since estimates of $\bar{x}+5s$ are themselves approximately normally
distributed, we will accept 50% of the lots having a true value of $\bar{x}+5s$
at the specification stimulus. The OC curve in this case is plotted vs.
the log fraction failing (as was done in Fig. 2). Plan E represents a single
stimulus sequential sampling plan which intersects the OC curve for
Plan D at the .05 and .95 probability levels.¹² The enormous amount of
testing that Plan E involves is shown in Fig. 4. While the testing for
Plan D involves an average sample of from 300 to 350 for each lot,
the sampling for Plan E might require as many as 100,000 items!

Another important advantage of tests of increased severity, an ad- 
vantage whose importance it is difficult to over-emphasize, lies in 
another type of OC curve shown in Fig. 5, which cannot in general be 
constructed when one tests at only one stimulus. This OC curve can 
only be plotted when quantitative measurements are taken, or when, 
as in this case, qualitative judgments are transformed into quantita- 
tive. 

The OC curve of Fig. 5 is derived from Plan D and plots the proba- 
bility of acceptance against the stimulus (Xmax.) where only three 
defectives in ten million occur. It will be seen that practically no lot 
will be accepted, the Xmax. of which exceeds the “specification stimu- 
lus” by more than 1.25 standard deviations. 

This information is extremely important since it provides an exact 
basis for setting a “safety factor” for inspection tests; we always want 
the specification stimulus to be more severe than the conditions to 
which we expect the product to be subjected in service, but the ques- 
tion that always puzzles the careful specification writer is the amount of 
safety margin to introduce. For example, suppose we know that a certain
weld will practically never receive a blow in excess of 150 foot-pounds
in practice. What value should be chosen for the blow which all
pieces are supposed to pass? We ought not to choose 150 foot-pounds,
for any sampling plan leads to errors, and we require some margin to
take care of these errors. What is often done is to select some arbitrary
figure, such as 200 or 250 foot-pounds, on a vague guess as to the
adequacy of the safety margin. If an inspection plan similar to Plan D is
used, such arbitrary guesses can be avoided. From tests on the product
we can estimate the highest standard deviation to be expected and
choose a value which exceeds 150 foot-pounds by more than 1.25
standard deviations.


12 See “Sequential Analysis of Statistical Data: Theory” and “Sequential Analysis of Statistical
Data: Applications,” AMP Report Nos. 30.1 and 30.2R.

There is a difficulty of a serious nature in all uses of arbitrary safety
margins. An arbitrarily selected margin will either seriously increase 
the rejections of perfectly good lots, or seriously increase the acceptance 
of bad. Such arbitrary margins really operate in the worst possible man- 
ner: if the lot is excellent in quality, the margin is a handicap, whereas 
for bad lots, where “safety” is most required, the margin is apt to be 
dangerously small. Thus, the need of exact procedures, where they can 
be employed, is imperative. 

The enormous saving in sample size, and the amount of information
to be gained, from an accurately designed increased severity test are
really no more than an example of the saving introduced by any
scientific advance. The saving indicates strongly the desirability of
experimental investigations whose purpose is to provide adequate in- 
creased severity tests. 


VII. ILLUSTRATIONS OF THE THEORY OF 
COMPLETE RUN-DOWN TESTS 


We shall illustrate the theory treated in Statistical Note No. 3 by 
two examples. The first example will involve finding out whether or 
not a certain lot meets a specification requiring that only 1 in 1,000 
objects fail to be affected at a certain preassigned stimulus. The second 
example will involve the comparison of two lots as to possible differ- 
ences in their quality. 

It will be recalled that in order to ensure distinguishing between a 
lot which has 1% failures and another which has .1% failures at the 
specification stimulus, it was necessary to test about 600 specimens. We 
also saw that sample sizes averaging about 20 specimens are adequate, 
if the tests were conducted at the 50% increased severity level and if the 
assumption is made that the critical stimuli for all lots are normally
distributed with the same standard deviation.

We shall find that we are generally required to test considerably 
more than 20 and in general considerably fewer than 600 specimens 
when we use the complete run-down. Despite the fact that we use more 
than 20 specimens in our sample, our test is far more efficient than the 
increased severity test taken at one level alone since we obtain valuable 
information on the distribution of critical stimuli without making 
restrictive assumptions regarding the nature of the distribution. 

Turning now to our first illustration of the theory contained in Note
3, we consider the following problem: an insecticide is being tested and
it is desired to know whether or not the insecticide is so effective that
all but 1 in 1,000 insects will fail to survive 40 milligrams of the
insecticide.

The test was carried out by testing 20 insects at each of the following
dosages: 5, 10, 15, 20, 25, 30, and 35 milligrams. The results were as
follows:

    Dosage             Number    Number       Fraction
                       Tested    Surviving    Surviving

    $x_1 =  5$ mg.       20        20         $p_1 = 1.0$
    $x_2 = 10$ mg.       20        19         $p_2 = .95$
    $x_3 = 15$ mg.       20        18         $p_3 = .90$
    $x_4 = 20$ mg.       20        14         $p_4 = .70$
    $x_5 = 25$ mg.       20         4         $p_5 = .20$
    $x_6 = 30$ mg.       20         1         $p_6 = .05$
    $x_7 = 35$ mg.       20         0         $p_7 = 0$





If we denote the dosages by $(x_i)$, it is convenient to transform the
set of dosages $(x_i)$ into a set $(x_i')$ where

$$x_i' = \frac{x_i - 5}{5}, \qquad x' = (0, 1, 2, 3, \ldots, 6).$$





Noting that our formulae for the mean critical stimulus and the
standard deviation of the critical stimuli involve all the $p_i$ exclusive
of $p_1 = 1.0$ and $p_7 = 0$, we find it convenient to tabulate the data as
follows:














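TABLE 2

Columns: (a) $x_i'$; (b) $p_i$; (c) $b_{2,i-1}$; (d) $q_i$; (e) $p_i q_i$; (f) $b_{2,i-1}\,p_i$;
(g) $b_{2,i-1} - 2\sum p_i$; (h) $(b_{2,i-1} - 2\sum p_i)^2$; (i) $(b_{2,i-1} - 2\sum p_i)^2\,p_i q_i$.

    (a)    (b)    (c)    (d)    (e)     (f)     (g)      (h)     (i)
     1     .95     1     .05    .048     .95    -4.6    21.16    1.02
     2     .90     3     .10    .09     2.70    -2.6     6.76     .61
     3     .70     5     .30    .21     3.50     -.6      .36     .08
     4     .20     7     .80    .16     1.40     1.4     1.96     .31
     5     .05     9     .95    .048     .45     3.4    11.56     .55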





It will be found that:
$\sum p_i = 2.8$; $\sum b_{2,i-1}\,p_i = 9.0$; $\sum p_i q_i = .56$; $\sum (b_{2,i-1} - 2\sum p_i)^2\,p_i q_i = 2.57$.


We are now in a position to substitute in the appropriate formulae in 
Statistical Note No. 3. The results are: 


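$$\bar{x}' = x_1' + .5 + \sum p_i = .5 + 2.8 = 3.3$$

$$s^2 = \sum b_{2,i-1}\,p_i - \Big(\sum p_i\Big)^2 + \sum \frac{p_i q_i}{n_i - 1} + G(\mu_2) = 9.00 - 7.84 + .03 - .08 = 1.11$$

$$s = 1.05.$$

Also

$$s_{\bar{x}'}^2 = \sum \frac{p_i q_i}{n_i - 1} = \frac{.56}{19}, \qquad s_{\bar{x}'} = .17$$

and

$$s_{M_2}^2 = \sum \Big(b_{2,i-1} - 2\sum p_i\Big)^2 \frac{p_i q_i}{n_i - 1} = \frac{2.57}{19} = .135, \qquad \therefore\ s_{M_2} = .37$$

but

$$s_s = \frac{s_{M_2}}{2s} = \frac{.37}{2(1.05)} = \frac{.37}{2.10} = .18$$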


Transforming into the original $x$ units it follows that the estimate of
the mean critical stimulus is equal to 21.50 milligrams. The estimate of
the standard deviation of the mean critical stimulus is given by
(.17)(5) = .85 milligram. It can be shown that $\bar{x}$ is very nearly normally
distributed even for small samples¹³ and, therefore, at a 95% confidence
level, the true mean critical stimulus ($\mu$) should be within:

21.50 ± (1.96)(.85) = 21.50 ± 1.67 = 19.83 to 23.17 milligrams.

At a 99% confidence level, $\mu$ should be within:

21.50 ± (2.58)(.85) = 21.50 ± 2.19 = 19.31 to 23.69 milligrams.


The estimated standard deviation of the critical stimuli in original 
units is given by (1.05) (5.00) = 5.25 milligrams and the standard devia- 
tion of the standard deviation is equal to .90 milligram. Again it is 
possible to set confidence limits on s. 

13 The fact that $\bar{x}$ and $s$ approach normality rapidly as a function of $N$, the number tested at each
stimulus, has been established by the authors. The results have not yet been published.

Let us recall, however, that we want fewer than 1 in 1,000 insects to
survive a dose of 40 milligrams. In terms of the transformed $x'$ units
this means that the 99.9% point should not exceed 7.0. But, it is easily
seen that $\bar{x}'+3.09s$ (the 99.9% point) $= 3.3 + (3.09)(1.05) = 6.54$ with a
standard error of:

$$s_{99.9}^2 = s_{\bar{x}'}^2 + (3.09)^2 s_s^2 = .028 + (9.6)(.0324) = .028 + .311 = .339, \qquad \therefore\ s_{99.9} = .58.$$

The probability of getting a sample value of 6.54 from a universe
possessing mean 7.0 and standard deviation .58 is found by noting that
6.54 is .79 standard deviation away from 7.0. There is approximately
one chance in five of getting the 99.9% point to be 6.54 or less and we
cannot assert that the insecticide meets the requirement at 40 milligrams
on the 5% or even the 10% level of significance.
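The whole insecticide calculation can be retraced with a short Python sketch. It is offered only as an illustration under stated assumptions: equal numbers tested at every dosage, unit spacing in the transformed units, and Sheppard's grouping correction taken as -1/12 (the text gives only its rounded value, -.08). The function and variable names are hypothetical. Since no intermediate rounding is done, the results differ from the hand figures in the second or third decimal.

```python
from math import sqrt
from statistics import NormalDist


def rundown_estimates(p, n, x1=0.0):
    """Spearman-type estimates from a complete run-down test.

    p  : observed fractions unaffected at unit-spaced stimuli, with
         p[0] == 1.0 and p[-1] == 0.0
    n  : number tested at each stimulus (assumed equal at every level)
    x1 : lowest stimulus, in transformed (unit-interval) units
    """
    inner = p[1:-1]                                   # drop p_1 = 1 and p_n = 0
    b2 = [2 * j + 1 for j in range(len(inner))]       # 1, 3, 5, ... = b_{2,i-1}
    var_p = [pi * (1.0 - pi) / (n - 1) for pi in inner]  # p_i q_i / (n_i - 1)
    sum_p = sum(inner)

    x_bar = x1 + 0.5 + sum_p
    s2 = (sum(b * pi for b, pi in zip(b2, inner)) - sum_p ** 2
          + sum(var_p) - 1.0 / 12.0)                  # assumed Sheppard term -1/12
    s = sqrt(s2)
    se_mean = sqrt(sum(var_p))
    se_m2 = sqrt(sum((b - 2.0 * sum_p) ** 2 * v for b, v in zip(b2, var_p)))
    se_s = se_m2 / (2.0 * s)
    return x_bar, s, se_mean, se_s


p_obs = [1.0, 0.95, 0.90, 0.70, 0.20, 0.05, 0.0]
x_bar, s, se_mean, se_s = rundown_estimates(p_obs, n=20)

k = 3.09                                              # deviate for the 99.9% point
point = x_bar + k * s
se_point = sqrt(se_mean ** 2 + k ** 2 * se_s ** 2)
limit = (40.0 - 5.0) / 5.0                            # 40 mg in transformed units
prob = NormalDist().cdf((point - limit) / se_point)   # chance of a value this low

print(round(x_bar, 2), round(s, 2), round(se_mean, 2), round(se_s, 2))
print(round(point, 2), round(se_point, 2), round(prob, 2))
```

Multiplying the transformed mean by 5 and adding 5 returns it to milligrams (21.5), and the final probability comes out near the "one chance in five" quoted above.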

It is easy to give an example of the comparison of two lots. The
calculations of the statistics of the lots would, of course, follow the same
pattern as that described above. For example, suppose that the first
lot had the characteristics just described and that the second lot
possessed an average critical stimulus (in transformed units) $\bar{y}' = 2.7$ with
$s_{\bar{y}'} = .15$; $s_y = 1.3$, $s_{s_y} = .14$.

It is then readily seen that the average critical stimuli differ
significantly for the two lots since

$$t = \frac{\bar{x}' - \bar{y}'}{\sqrt{s_{\bar{x}'}^2 + s_{\bar{y}'}^2}} = \frac{3.3 - 2.7}{\sqrt{(.17)^2 + (.15)^2}} = \frac{.6}{.22} = 2.7.$$





The probability of getting a difference as large as .6 is less than 1 in 100 
and therefore the lots differ significantly as to their mean critical stim- 
uli. The standard deviations of the critical stimuli do not differ signifi- 
cantly. Computation will also show that the 99.9% points differ sig- 
nificantly on the 5% level of significance. 
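The comparison of the means and of the standard deviations can be checked with a few lines of Python. This is a sketch only: it treats each estimate as normally distributed and the two lots as independent, reads the second lot's figures as a standard deviation of 1.3 with standard error .14, and uses a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def difference_test(a, se_a, b, se_b):
    """Return (t, two-sided normal probability) for the difference a - b."""
    t = (a - b) / sqrt(se_a ** 2 + se_b ** 2)
    prob = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, prob


# Mean critical stimuli of the two lots (transformed units).
t_mean, p_mean = difference_test(3.3, 0.17, 2.7, 0.15)
# Standard deviations of the critical stimuli of the two lots.
t_sd, p_sd = difference_test(1.3, 0.14, 1.05, 0.18)

print(round(t_mean, 2), round(p_mean, 4))
print(round(t_sd, 2), round(p_sd, 2))
```

Under these assumptions the means differ at better than the 1 in 100 level, while the difference between the standard deviations is well within sampling error.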


VIII. CONCLUSIONS AND PROPOSED INVESTIGATIONS 


These examples are by no means intended to exhaust the possibilities 
of the application of the theory of complete run-down tests as applied 
to sensitivity data. In another paper it will be shown how one can 
make modifications of this approach to the case where the test is not 
a complete run-down and where the data are given over only a partially 
complete set of stimuli. It also will be shown that in some instances 
more precise inferences concerning $\bar{x}$, $s$, and $\bar{x}+ks$ can be made by
varying the sample size tested at each stimulus. These problems are of 
key importance from a practical standpoint and merit consideration 
in a separate paper. 

In conclusion, the authors wish to emphasize that it has not been 
their object to treat the subject of increased severity tests exhaustively. 
The key purpose of the paper is to bring the various methods into 
bold relief, to compare these methods in some of their aspects, and to 
point out some of the possibilities inherent in the newer methods. The 
authors hope that this paper will serve to stimulate the application of 
modern statistical methods to tests of increased severity by design
and research engineers.


STATISTICAL NOTE 1


Stated in statistical terms, the problem is to estimate the universe average $\mu$
and the standard deviation $\sigma$ (and also their standard errors) of the random
variable $x$, which denotes the stimulus required to obtain a critical response of
any individual drawn from some universe of objects. This problem differs from
the usual one arising in frequency distributions since the critical stimulus cannot
be measured for any individual in the universe. This forces us to find estimates of
$\mu$ and $\sigma$ knowing only the estimates $p_i$ of the fraction of objects surviving at a
discrete set of stimuli $x_i$, $i = 1, 2, \ldots, n$. This approach is essentially the one
followed in the “probit” method and other modern treatments of the
dosage-mortality problem and tests of increased severity.


STATISTICAL NOTE 2 


It can be shown using the first term of the binomial expansion that the chance
$L(N, Q)$ of finding no defectives in a random sample of size $N$ drawn from a lot
with true fraction defective $Q$ is given by $L(N, Q) = (1-Q)^N$ (where it is assumed
that non-replacement of an inspected article does not affect the quality of the
remainder of the lot). The results in Table I may be obtained by letting $N = 20$ or
$N = 100$ and assigning various values to $L$. For example, if $N = 100$ and
$L(N, Q) = .5$, then $Q = .007$. This means that there is a 50-50 chance of accepting
lots which are .7% defective, if acceptance is based on finding no defects in a
sample of 100.
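As a small illustration, the relation $L(N, Q) = (1-Q)^N$ and its inverse can be evaluated directly in Python. The function names are hypothetical and the sketch assumes nothing beyond the formula just stated.

```python
def prob_no_defectives(n, q):
    """Chance L(N, Q) of finding no defectives in a sample of n when the
    true fraction defective is q."""
    return (1.0 - q) ** n


def fraction_defective_for(n, chance):
    """Fraction defective Q at which the chance of acceptance equals `chance`."""
    return 1.0 - chance ** (1.0 / n)


for n in (20, 100):
    for chance in (0.95, 0.50, 0.05):
        print(n, chance, round(fraction_defective_for(n, chance), 4))

print(round(prob_no_defectives(100, 0.01), 3))   # lot with 1% defectives, N = 100
```

For $N = 100$ and a 50-50 chance the first function's inverse gives a fraction defective of about .007, in agreement with the example above.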


STATISTICAL NOTE 3 


It is assumed in the following that $p_i$ is an unbiased estimate of the fraction
unaffected at stimulus $x_i$ and that $p_i - p_{i+1}$ is an unbiased estimate of the fraction
just affected at stimulus $(x_i + x_{i+1})/2$. In practice it will be found convenient to
choose intervals $x_{i+1} - x_i$ whose lengths are approximately given by $\sigma$. However,
in some cases good results can be obtained by making the lengths of the intervals
as large as $2\sigma$. For convenience, we shall assume that the interval of test is unity:
$x_{i+1} - x_i = 1$ $(i = 1, 2, \ldots, n-1)$.

Under the above restrictions we shall show how the average critical stimulus
(denoted by $\mu$), the standard deviation of the critical stimuli (denoted by $\sigma$),
and the higher moments may be easily estimated. We shall also show how the
sampling errors of the estimates $\bar{x}$, $s$ and the higher moments can be found. A
brief summary of the proofs and results will be given. These formulae depend for
their validity upon certain assumptions regarding the interval of test and the
distribution function of the critical stimuli.

If $p_i$ denotes the observed fraction of objects unaffected at $x_i$ and $p_i'$ is defined
as $p_i' = p_i - p_{i+1}$, then the key formulae are as follows:

The estimated mean about the origin is given by:

$$\bar{x} = \sum_{i=1}^{n-1} (p_i - p_{i+1})\,\frac{x_i + x_{i+1}}{2} = \sum_{i=1}^{n-1} (p_i - p_{i+1})(x_i + 1/2); \tag{1}$$

the estimated $q$th moment about the origin is:

$$m_q' = \sum_{i=1}^{n-1} (p_i - p_{i+1})(x_i + 1/2)^q, \tag{2}$$

and the estimated $q$th moment about the mean is:

$$M_q = \sum_{i=1}^{n-1} (x_i + .5 - \bar{x})^q\,p_i'. \tag{3}$$


If we take our assumed origin of coordinates at $x_1 + .5$, then it can be shown by
simple algebraic manipulation that $m_1'$ may be written simply as:

$$m_1' = \sum_{i=2}^{n-1} p_i \tag{4}$$

where the summation is taken over all the $p_i$, except $p_1 = 1.0$ and $p_n = 0$. The
value of $\bar{x}$ taken about 0 as origin is given by:

$$\bar{x} = x_1 + .5 + \sum_{i=2}^{n-1} p_i \tag{5}$$

where $x_1$ represents the stimulus for which none of the samples will be affected.
For the higher moments, it is easy to show that if $m_q'$ is taken about $x_1 + .5$ as
origin, then:

$$m_q' = \sum_{i=2}^{n-1} b_{q,i-1}\,p_i \tag{6}$$

where $b_{q,i}$ represents the $i$th first difference of the consecutive $q$th powers of the
positive integers, i.e., $b_{q,i} = i^q - (i-1)^q$.
For example, if $q = 2$,

$$b_{2,1} = 1^2 - 0^2 = 1; \quad b_{2,2} = 2^2 - 1^2 = 3; \quad b_{2,i} = i^2 - (i-1)^2 = 2i - 1.$$

In particular, $m_2' = p_2 + 3p_3 + 5p_4 + \cdots + (2n-5)\,p_{n-1}$.

Again, the origin of coordinates is taken at $x_1 + .5$.

Using the well known relationship connecting the $q$th moment about the mean
with the moments about $x_1 + .5$ as origin,

$$M_q = m_q' - q\,m_1' m_{q-1}' + \frac{q(q-1)}{2!}\,(m_1')^2 m_{q-2}' - \cdots + (-1)^{q-1}(q-1)(m_1')^q, \tag{7}$$

it is possible to express $M_q$ in terms of the $p_i$.
In particular for $q = 2$, we have¹⁴:

$$M_2 = s^2 = m_2' - (m_1')^2 = p_2 + 3p_3 + 5p_4 + \cdots + (2n-5)\,p_{n-1} - \Big(\sum_{i=2}^{n-1} p_i\Big)^2. \tag{8}$$


It will be noted that we have been able to express the moments about the
assumed origin $(x_1 + .5)$ as linear functions of the $p_i$. As a direct consequence of
this result, and also because the $p_i$ are all independent, it follows that the
standard errors of the various statistics are:

$$s_{m_1'}^2 = s_{\bar{x}}^2 = \sum_{i=2}^{n-1} s_{p_i}^2 = \sum_{i=2}^{n-1} \frac{p_i q_i}{n_i - 1} \tag{9}$$

$$s_{m_q'}^2 = \sum_{i=2}^{n-1} b_{q,i-1}^2\,s_{p_i}^2 \tag{10}$$

where $n_i$ = number of samples tested at the stimulus $x_i$ and
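$$s_{M_q}^2 = \sum_{i=2}^{n-1} (B_{q,i-1} - q\,M_{q-1})^2\,s_{p_i}^2 \tag{11}$$

where

$$B_{q,i} = b_{q,i} - \binom{q}{1} b_{q-1,i}\,\bar{x} + \binom{q}{2} b_{q-2,i}\,\bar{x}^2 - \cdots + (-1)^{q-1} q\,\bar{x}^{q-1} b_{1,i}. \tag{12}$$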


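In particular in the case that $q = 2$, it follows that:

$$s_{M_2}^2 = s_{s^2}^2 = \sum_{i=2}^{n-1} (b_{2,i-1} - 2\bar{x})^2\,s_{p_i}^2, \tag{13}$$

with $\bar{x}$ measured from the origin $x_1 + .5$, so that $\bar{x} = \sum p_i$.
If (13) is combined with the well known result that

$$s_s = s_{M_2}/2s, \tag{14}$$

it follows that:

$$s_s = \frac{\sqrt{\sum_{i=2}^{n-1} \big\{(2i-3) - 2\sum p_i\big\}^2\,s_{p_i}^2}}{2\sqrt{\sum_{i=2}^{n-1} (2i-3)\,p_i - \big(\sum_{i=2}^{n-1} p_i\big)^2}}. \tag{15}$$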





14 This is not, however, an unbiased estimate of $\sigma^2$. It can be shown that (5) is an unbiased estimate
of $\mu$ and that

$$s^2 = \sum_{i=2}^{n-1} b_{2,i-1}\,p_i - \Big(\sum_{i=2}^{n-1} p_i\Big)^2 + \sum_{i=2}^{n-1} \frac{p_i q_i}{n_i - 1} + G(\mu_2),$$

where $G(\mu_2)$ is Sheppard's grouping correction, is an unbiased estimate of $\sigma^2$. Formulae (10) and (11)
are the standard errors of these unbiased estimates.

It is often of interest to know not only the standard errors of $\bar{x}$ and $s$, but also
to know the standard errors of $\bar{x}+ks$. Fortunately, the $s^2$ of $(\bar{x}+ks)$ can be expressed
directly in terms of $s_{\bar{x}}^2$ and $s_s^2$ because of the fact that $\bar{x}$ and $s$ generally have only
a small correlation coefficient (<.1). Therefore, treating $\bar{x}$ and $s$ as independent,
it follows that:

$$\text{var.}(\bar{x}+ks) = s_{\bar{x}}^2 + k^2 s_s^2 = \sum_{i=2}^{n-1} \frac{p_i q_i}{n_i - 1} + k^2\,\frac{\sum_{i=2}^{n-1} \big\{(2i-3) - 2\sum p_i\big\}^2 \dfrac{p_i q_i}{n_i - 1}}{4\Big[\sum_{i=2}^{n-1} (2i-3)\,p_i - \big(\sum_{i=2}^{n-1} p_i\big)^2\Big]}. \tag{16}$$




STATISTICAL NOTE 4 


The OC curves for the various plans are computed as follows: 

Plans A and B: For any sample size $n$, the probability of acceptance (i.e., the
probability of obtaining no defects) is $P^n$ where $P$ is the true fraction effective,
i.e., $P = 1 - Q$, where $Q$ is the value shown on the abscissa.

Plan C: In this plan the difference between $X_1$ (the specification stimulus) and
$X_0$ (the increased severity stimulus) is 3.29 standard deviations since the standard
lot is assumed to be one having .05% defectives at $X_1$. Bearing this in mind,
it is easy to calculate (using the table of areas under the normal curve) the true
fraction defective $Q_0$ at $X_0$ given $Q_1$ at $X_1$. For example, if $Q_1 = .0014$ for a certain
lot then $Q_0 = .615$. The plan asserts that a lot is to be rejected if more than 60
defects occur in 100. The probability of accepting the lot with true quality
$Q_0 = .615$ is found by determining:

$$t = \frac{(.60 - Q_0)\,n^{1/2}}{[Q_0(1 - Q_0)]^{1/2}} = \frac{(.60 - .615)(100)^{1/2}}{[(.615)(.385)]^{1/2}} = -.31.$$

This means that the probability of acceptance is .38. It is to be noted that this
method of computing the OC curve for Plan C is not generally applicable since
$t$ will not be normally distributed for all values of $Q_0$. When $Q_0$ is less than .10,
the Poisson distribution represents an accurate estimate.
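The Plan C calculation can be sketched in Python as follows. The sketch assumes the normal approximation described above, a standard deviation that is the same for all lots, and a hypothetical function name. It works from unrounded normal deviates, so for $Q_1 = .0014$ it gives a $Q_0$ of about .62 and an acceptance probability a few hundredths below the hand figure of .38.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def plan_c_acceptance(q1, n=100, max_fail_fraction=0.60, standard_q1=0.0005):
    """Return (Q_0, probability of acceptance) for a lot under Plan C.

    q1          : true fraction defective at the specification stimulus X_1
    standard_q1 : fraction defective of the standard lot at X_1 (.05%)
    Normal approximation to the binomial, as in the text.
    """
    gap = _STD_NORMAL.inv_cdf(1.0 - standard_q1)  # (X_0 - X_1)/sigma, about 3.29
    k = _STD_NORMAL.inv_cdf(1.0 - q1)             # lot mean lies k sigma above X_1
    q0 = _STD_NORMAL.cdf(gap - k)                 # fraction failing at X_0
    t = (max_fail_fraction - q0) * sqrt(n) / sqrt(q0 * (1.0 - q0))
    return q0, _STD_NORMAL.cdf(t)


q0_example, accept_example = plan_c_acceptance(0.0014)
q0_standard, accept_standard = plan_c_acceptance(0.0005)
print(round(q0_example, 3), round(accept_example, 2))
print(round(q0_standard, 3), round(1.0 - accept_standard, 3))
```

For the standard lot the fraction failing at $X_0$ is one half and the risk of rejection comes out at about .023.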

Plan D: Again $Q_1$ is given. Assuming a normal distribution of critical stimuli,
we can solve for $K$ in the equation:

$$\mu + K\sigma = X_1,$$

where $\mu$ and $\sigma$ are the true mean and standard deviation, respectively, of the
universe. The solution is obtained, as above, from tables of areas of the normal
curve. A given lot will be acceptable if and only if $\bar{x} + 5s \le X_1$, where $\bar{x}$ and $s$ are
the sample estimates of $\mu$ and $\sigma$, respectively. In order to determine the probability
of acceptance, we have to determine the probability that we will estimate
$\bar{x} + 5s$ to be less than or equal to $X_1$ when $\mu + K\sigma = X_1$. This is derived from:

$$T = \frac{X_1 - (\mu + 5\sigma)}{\sigma_{\bar{x}+5s}} = \frac{\mu + K\sigma - \mu - 5\sigma}{\sigma_{\bar{x}+5s}} = \frac{(K - 5)\sigma}{\sigma_{\bar{x}+5s}}.$$

For the sample size tested, the estimates of $\mu$ and $\sigma$ are approximately (and
asymptotically) normally distributed, and hence so are the estimates of $\mu + 5\sigma$,
and $T$, if $\sigma_{\bar{x}+5s}$ is known and is constant. We assume $\sigma_{\bar{x}} = \sigma_s = .1\sigma$, and since $\bar{x}$
and $s$ are approximately independently distributed, $\sigma_{\bar{x}+5s} = \sqrt{.26\sigma^2}$. Hence,

$$T = \frac{(K - 5)\sigma}{\sqrt{.26}\,\sigma} = \frac{K - 5}{.51}.$$

$T$ can then be computed for any value of $K$, i.e., for any value of $Q_1$, and the
required probability of acceptance can be determined from normal tables.
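The Plan D derivation reduces to two table look-ups, which the following Python sketch mirrors. It assumes, as in the text, that the sampling errors of the mean and of the standard deviation are each one tenth of the universe standard deviation and independent; reading K as the normal deviate cut off by the fraction defective is this sketch's interpretation, and the function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def plan_d_t(k: float, sd_xbar: float = 0.1, sd_s: float = 0.1,
             multiple: float = 5.0) -> float:
    """T = (K - 5) / sqrt(.26), with sampling errors expressed in units of sigma."""
    # Variance of (mean + 5 s) for independent errors: .01 + 25(.01) = .26
    sd_criterion = (sd_xbar ** 2 + (multiple * sd_s) ** 2) ** 0.5
    return (k - multiple) / sd_criterion


def plan_d_acceptance(q1: float) -> float:
    """Probability of acceptance for a lot with fraction defective q1
    at the specification stimulus."""
    k = STD_NORMAL.inv_cdf(1.0 - q1)  # assumed reading of "solve for K"
    return STD_NORMAL.cdf(plan_d_t(k))


if __name__ == "__main__":
    for q1 in (3e-7, 1e-8, 1e-10):
        print(q1, round(plan_d_acceptance(q1), 3))
```

A lot lying exactly five standard deviations from the specification stimulus gives T = 0 and hence an even chance of acceptance; better lots give positive T and poorer lots negative T.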

The OC curve for Plan D, showing the probability of acceptance vs. the stimulus
($X_{max.}$) where only three defectives in 10 million occur, is easily derived from:

$$T = \frac{X_1 - X_{max.}}{\sigma_{\bar{x}+5s}},$$

the same procedure being used as was used above.

FIGURE 2
Operating Characteristic Curves for Single Stimulus Test
Without Increased Severity (Plans "A" and "B")
With Increased Severity (Plan "C")

[Graph not reproduced. Vertical axis: Probability of Acceptance. Horizontal axis: Log of Fraction Defective in Lot, with -3, -2, -1, and 0 corresponding to .1%, 1%, 10%, and 100%.]

Inspection Plan "A": Accept if 0 defectives are found in 20; otherwise reject.

Inspection Plan "B": Accept if 0 defectives are found in 100; otherwise reject.

Inspection Plan "C": Test 100 at increased severity, stimulus $X_2$ such that the standard lot has .05% failures at the specification stimulus. Accept if the number of failures at $X_2$ does not exceed 60; otherwise reject. Critical stimuli normally distributed, $\sigma$ constant for all lots.
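The acceptance rules stated in Figure 2 for Plans A and B (accept only if no defectives are found in 20, or in 100) lead to a simple operating characteristic. A minimal Python sketch follows, assuming lots large enough that each sampled item is defective independently with probability equal to the lot fraction defective; the function name is hypothetical.

```python
def zero_defect_acceptance(q: float, n: int) -> float:
    """Probability of finding no defectives in a sample of n
    when the lot fraction defective is q."""
    return (1.0 - q) ** n


if __name__ == "__main__":
    print("fraction defective   Plan A (n=20)   Plan B (n=100)")
    for q in (0.001, 0.01, 0.1):
        print(f"{q:18.3f}   {zero_defect_acceptance(q, 20):13.3f}   "
              f"{zero_defect_acceptance(q, 100):14.3f}")
```

At 1% defective the larger sample of Plan B accepts far less often than Plan A, which is the kind of separation between curves that an operating characteristic chart displays.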

FIGURE 3
Operating Characteristic Curves for Run-Down Test (Plan "D")
and Best Single Stimulus Test (Plan "E")

FIGURE 4
Average Sample Size Curve for Single
Stimulus Sequential Test which Intersects Plan "D"
at the .95 and .05 Probability Levels


BOOK REVIEWS


Edited by 
Oscar Krisen Buros
Rutgers University 


Instalment Mathematics Handbook: With Working Formulas for All Types of 
Transactions. Milan V. Ayres (Consulting Statistician, 5401 Woodlawn Ave., 
Chicago 15, Ill.). New York 10: Ronald Press (15 East 26th St.), 1946. Pp. xvi, 
267. $10.00. 


Review by Carl H. Fischer
Associate Professor of Mathematics, University of Michigan 


As its title implies, this book is intended to furnish “comprehensive and
detailed information which will enable the user readily to make reliable
time-payment calculations.” The first of two parts consists of some 121 for- 
mulas and 13 pages of numerical tables. This section contains practically no 
explanation except for some illustrative examples. It is doubtful whether 
the class of users for whom this book was written, that is, “anyone familiar 
merely with arithmetic, simple algebra, and enough plane geometry... ” 
could make use of this portion of the book without having previously gone 
carefully through the second section. 

The second part, entitled “Derivations, Discussion and Proofs,” consti- 
tutes the major portion of the book. Here, most of the formulas listed in Part 
I are developed, usually in considerable detail, and illustrated by numerical 
examples. The author has evidently had a great deal of practical experience 
in the field. This section of the book is liberally sprinkled with references to 
various practices, apparently common in instalment financing, some of 
which would certainly tend to discourage the lay reader from “buying on 
time.” 

There are three principal types of problems considered. The first is an 
internal analysis of finance company operations. Formulas are developed for 
such items as the rate of accumulation and of liquidation of instalment paper, 
an analysis of retail receivables, the collection ratio and life of notes, the 
ratio of borrowings to capital, and the distribution of earnings. These for- 
mulas, in the main, are based upon the arithmetic progression. Much of 
the analysis is quite ingenious although some parts of the exposition are 
not too clear, particularly to one not well versed in the instalment loan 
business. 

The second type of problem given consideration is the determination of 
the size of the monthly payments under various kinds of common instalment 
contracts and interest assumptions. Here one is introduced to the meaning 
of such trade jargon as “balloon note contracts,” “hold-back deals,” “Morris
Plan,” “dealers’ packs,” and others. The chief criticism of the multitude
of formulas developed is that most are based upon the uniform method
of determining yield, discussed below. From a practical standpoint, how- 
ever, the formulas are undoubtedly satisfactory. 

The third sort of problem is the determination of the yield to the finance 
company under various kinds of contracts. Six different methods are dis- 
cussed, four of the “summation” type and two of the exponential. The au- 
thor prefers the uniform method, an approximation, under which it is as- 
sumed that each instalment payment is apportioned in exactly the same 
way between reduction of principal and payment of charge or interest. The 
only justification for this procedure is the practical one of simplicity. 
Throughout the entire book, with stated exceptions, this is the interest 
method used. Another simple approximation rule, which is discussed only 
briefly, appears to have more theoretical merit. This is the pro-rata method, 
based upon the assumption that the portion of each payment devoted to the 
charge is proportional to the “total outstandings” at the start of that period, 
that is, to the sum of the outstanding principal plus the unpaid portion of 
the total charge. This deviates only slightly from ordinary compound 
interest. 
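The difference between the uniform and pro-rata methods can be illustrated with a short Python sketch. It assumes a contract of n equal instalments (an assumption added here for simplicity), in which case the total outstandings at the start of each period are proportional to the number of payments still to be made; the function names are hypothetical and are not taken from the handbook.

```python
def uniform_charge_split(total_charge: float, n: int) -> list[float]:
    """Uniform method: every instalment carries the same share of the charge."""
    return [total_charge / n] * n


def pro_rata_charge_split(total_charge: float, n: int) -> list[float]:
    """Pro-rata method for n equal instalments: the charge in each payment is
    proportional to the total outstandings at the start of that period."""
    remaining = [n - k for k in range(n)]  # n, n-1, ..., 1 payments outstanding
    total = sum(remaining)                 # n(n+1)/2, an arithmetic progression
    return [total_charge * r / total for r in remaining]


if __name__ == "__main__":
    print(uniform_charge_split(78.0, 12))
    print(pro_rata_charge_split(78.0, 12))
```

With a charge of 78 spread over 12 payments, the uniform method assigns 6.5 to every payment, while the pro-rata shares fall from 12 in the first payment to 1 in the last, so that more of the charge is earned while more money is outstanding.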

It could be pointed out that one of the other summation processes dis- 
cussed, the residuary method, under which it is assumed that all of the early 
payments are applied to reduction of principal and the charge is paid off 
last, produces a yield identical with that obtained by the use of the well- 
known “Merchants’ Rule,” under which the entire principal draws simple 
interest for the entire period; likewise, instalment payments made draw 
simple interest from the date of payment to the end of the period. 
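The Merchants' Rule as described in the preceding sentence can be written as the equation $P(1 + rT) = \sum_k R_k[1 + r(T - t_k)]$ and solved for the rate $r$. The Python sketch below does this; the symbols, the function name, and the sample contract are illustrative assumptions rather than material from the handbook.

```python
def merchants_rule_rate(principal: float,
                        payments: list[tuple[float, float]],
                        term_years: float) -> float:
    """Simple-interest yield under the Merchants' Rule.

    payments is a list of (time_in_years, amount) pairs. The principal draws
    simple interest for the whole term; each payment draws simple interest
    from its date to the end of the term.
    """
    total_paid = sum(amount for _, amount in payments)
    # Dollar-years over which the payments themselves earn interest.
    payment_dollar_years = sum(amount * (term_years - t) for t, amount in payments)
    net_dollar_years = principal * term_years - payment_dollar_years
    if net_dollar_years <= 0:
        raise ValueError("payments leave no net principal outstanding")
    return (total_paid - principal) / net_dollar_years


if __name__ == "__main__":
    # Hypothetical contract: 100 advanced, repaid in 12 monthly instalments of 9.
    monthly = [(k / 12, 9.0) for k in range(1, 13)]
    print(merchants_rule_rate(100.0, monthly, 1.0))
```

For this contract the total charge of 8 is spread over 50.5 dollar-years of net outstanding principal, so the rate is 8/50.5.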

The author is aware that an exponential method is generally considered 
to be the preferred interest procedure but contends, somewhat surprisingly, 
that it is impractical for use in the ordinary finance company office because 
of its complication and because of the unavailability of the necessary interest 
tables. He seems to feel that there are two distinct exponential methods, 
labels one the small loan and the other the present worth method, and devotes 
several pages to an argument endeavoring to show the superiority of the 
latter. Analysis shows, however, that the two methods are identical except 
for the fact that, whereas the small loan method furnishes a nominal interest 
rate, converted monthly, the present worth method goes a step farther and 
produces the equivalent effective rate. 
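The relation between a nominal rate converted monthly and its equivalent effective rate, which is the only distinction the reviewer finds between the two exponential methods, is the standard compound-interest conversion. A brief Python sketch, with hypothetical function names:

```python
def effective_annual_rate(nominal_rate: float, periods_per_year: int = 12) -> float:
    """Effective annual rate equivalent to a nominal rate converted
    periods_per_year times a year."""
    return (1.0 + nominal_rate / periods_per_year) ** periods_per_year - 1.0


def nominal_rate_from_effective(effective_rate: float, periods_per_year: int = 12) -> float:
    """Inverse conversion: nominal rate converted periods_per_year times a year."""
    return periods_per_year * ((1.0 + effective_rate) ** (1.0 / periods_per_year) - 1.0)


if __name__ == "__main__":
    print(effective_annual_rate(0.12))  # 1% a month compounded for 12 months
```

A nominal 12 per cent converted monthly corresponds to an effective rate of about 12.68 per cent, so the two methods report different figures for the same contract while resting on the same equation of value.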

The author seems to overlook a similar point in chapter 18 where he goes 
into considerable detail in comparing two supposedly unlike methods for 
computing the yield on “hold-back deals.” It is easy to show that his so- 
called “direct” method is merely a special case of the “reserve” method, 
under which the reserve fund earns interest at the yield rate. 

There is rather an extensive chapter, the last one in the book, on inter- 
polation. After a brief discussion of the linear case, the bulk of the chapter 
is devoted to the development of an interpolation method based upon fitting
certain hyperbolas to three given points and to an attempt to show
the superiority of this method over the standard finite difference methods. 
As might be expected, the result is decidedly unconvincing; this chapter, 
which might well have been listed as an appendix, is easily the poorest 
portion of the book. 

Such defects as have been mentioned here do not detract seriously from 
the importance of this book in its field. It should be indispensable in any 
finance company office, particularly if the state of mathematical erudition
there is as implied by the author. In addition, the book should be considered 
almost equally valuable to teachers of the mathematics of finance and to 
authors of textbooks in the subject. After reading this handbook, the re- 
viewer is impressed with the inadequacy of the treatment of instalment 
financing in the leading texts in the mathematics of finance. Even authors 
of textbooks in college algebra might find in this handbook some interesting 
and practical applications of arithmetic progressions with which they could 
enliven the topic, replacing the ever-present problems on the number of logs 
in a pile and the distance run in a potato race. 


Mathematical Tables and Other Aids to Computation. Published quarterly be- 
ginning January 1943 by the National Research Council. (Number 15, July 1946 
was the last issue available at the time of writing this review.) Beginning with 
1947 the subscription price for each calendar year is $4.00, payable in advance; 
ordinary single numbers, $1.25. Earlier ordinary single numbers, each $1.00, 
and all numbers for each of the years 1943 to 1946 inclusive, $3.00. Special 
single number 7, Guide to Tables of Bessel Functions, $1.75, and number 12, 
$1.50. All payments are to be made to National Academy of Sciences, 2101 
Constitution Avenue, Washington, D. C. Edited on behalf of the Committee on 
Mathematical Tables and Other Aids to Computation by Raymond Clare Archi- 
bald (Professor Emeritus, Brown University) and Derrick Henry Lehmer with the 
cooperation of Leslie John Comrie and Solomon Achillovich Joffe. 


Review by Kenneth J. Arnold
Assistant Professor of Mathematics, University of Wisconsin 


Statisticians use a wide variety of mathematical tables and computational
procedures, some of which have been developed in statistics and
others of which have been developed in astronomy, psychology, business, 
engineering, and various other fields. Even such statistical tables as distribu- 
tion functions, percentage points, and constants of sampling distributions 
sometimes appear as appendices to articles devoted to particular applica- 
tions and in journals not usually perused by statisticians. Frequently the 
general utility of such tables is disguised by headings in terms of the par- 
ticular application as “No. of animals which die” for the upper limit in a 
binomial summation. Collections of tables such as Pearson’s Tables for 
Statisticians and Biometricians and Fisher and Yates’ Statistical Tables for 
Biological, Agricultural and Medical Research cannot obviate the necessity 
of reference to journals for tables needed in the solution of some of the
most frequently recurring problems. The continual appearance of new
tables and new computational methods and the consequent increase in the 
problem of finding whether and where appropriate tables and methods are 
available are phenomena common to all fields in which computers and table 
users work. A journal devoted to the problem became a necessity and in 
January 1943 the first issue of Mathematical Tables and Other Aids to Com- 
putation appeared. The role of MTAC was described by Professor Archibald 
in the opening sentences of this first issue: 


This Quarterly Journal, a new publication of the National Research Council, 
is to serve as a clearing-house for information concerning mathematical 
tables and other aids to computation. Especially during the past decade 
have tools for computation been vastly multiplied. These tools, or accounts 
of them, are to be found in an enormous international range of book, pam- 
phlet, and periodical publication, not only in the fields of Pure Mathematics, 
Physics, Statistics, Astronomy, and Navigation, but also in such fields as 
Chemistry, Engineering, Geodesy, Geology, Physiology, Economics, and 
Psychology. An attempt will here be made to guide varied types of inquirers 
to such material. 


A summary of the contents of the first 15 numbers gives an assuring pic- 
ture of the manner in which MTAC is fulfilling its purpose. The numbers 
1 to 12, constituting Volume I, total 480 pages; the numbers 13 to 15 total 
148 pages. Each ordinary number contains an article or articles on tables, 
computing machines, or computational methods. These articles are fol- 
lowed by sections entitled “Recent Mathematical Tables,” “Mathematical 
Tables—Errata,” “Unpublished Mathematical Tables,” “Mechanical Aids 
to Computation,” “Notes,” “Queries—Replies,” and “Corrigenda et Ad- 
denda.” Special number 7 is a 104-page “Guide to Tables of Bessel Func- 
tions” by H. Bateman and R. C. Archibald. Special number 12 contains, 
in addition to the contents of an ordinary number, an index to Volume I. 

Among the articles in numbers 1 to 6 and 8 to 15 are “Machines for 
Solving Algebraic Equations” by J. S. Frame and “Scientific Computing in 
Great Britain” by J. R. Womersley. 

Under “Recent Mathematical Tables,” 241 reviews have appeared. With 
each is associated a code letter or letters indicating the class of function or 
the field of application; as examples: A for arithmetical tables and con- 
stants, C for logarithms, H for numerical solution of equations, J for finite 
differences and interpolation, K for statistics, N for interest and investment, 
O for actuarial science, Z for calculating machines and mechanical computa- 
tion. Of the 241 reviews, 23 have associated K’s. Each review contains 
definitions of the functions tabulated, the range of arguments, the number 
of decimals or significant figures in the entries. Most reviews also discuss 
accuracy, errata, related tables and applications. The thoroughness of the 
reviews is indicated by the fact that W. G. Cochran’s review of the second 
edition of Fisher and Yates’ Statistical Tables occupies 4 pages, roughly 
equivalent to 6 pages in this review section. 

Under “Mathematical Tables—Errata,” 86 lists have appeared. There is 
no rejoinder to the comment made by R. C. Archibald in presenting errata
in several statistics books: “Authors of frequently used works in the field
of Statistics display some carelessness in the preparation of tables they 
publish.” 

Under “Unpublished Mathematical Tables,” 49 entries have appeared. 
Some of these tables have thus been available several years before publica- 
tion; others were never intended for publication but have been used in the 
construction of published tables. Under the next four headings the follow- 
ing numbers of entries have appeared: “Mechanical Aids to Computation,” 
22; “Notes,” 58; “Queries,” 18; “Queries—Replies,” 24. 

Cross references are quite adequate. In each entry, reference is made to 
earlier or concurrent related entries. In each section, reference is made to 
material appropriate to that section contained in entries in other sections; 
for example, the reviews of recent tables which contain lists of errata are 
mentioned under the heading “Mathematical Tables—Errata.” The follow- 
ing two adverse criticisms will probably be invalidated in a short time. The 
over-long delay in the publication of reviews of some recent statistical tables 
can perhaps to a large extent be attributed to the delinquency of reviewers. 
While the Index to Volume I is inadequate as an aid to discovering tables 
of a particular function, Professor Archibald informs me that a detailed 
“Subject Index” covering the whole of Volumes I and II is planned for 
inclusion in the Index to Volume II which will probably appear in October 
1947. 

It can be hoped that a guide to mathematical statistical tables will 
appear in the not too distant future but a guide cannot replace the listing 
of errata and the careful reviews of MTAC. 


Industrial Experimentation [Revised Edition]. K. A. Brownlee (Research Department,
Distillers Co., Ltd., Great Burgh, Epsom, Surrey, England). A revision
of the 1945 memorandum of same title reviewed in this JOURNAL, March
1946. Directorate of Ordnance Factories (Explosives), Ministry of Supply.
London, W.C. 2: H. M. Stationery Office, 1946. Pp. 116. 2s. Paper. (New York
20: British Information Services (30 Rockefeller Plaza), 1946. $0.60.) Two reviews
follow:


Review by George W. Brown
Research Associate Professor 
Statistical Laboratory, Iowa State College 


The revised edition is generally of the same character as, and in most respects
identical with, the first edition. It has, however, been somewhat
expanded in certain respects, notably with respect to multiple correlation 
and analysis of variance. Other additions, mostly in the form of explanatory 
remarks and cautions, contribute to a general improvement. 

This reviewer is in substantial agreement with most of the comments, both 
favorable and adverse, made by Tukey and Wolfowitz in their reviews of the 
first edition. In fairness to the author, it should be pointed out that the 
present edition had passed galley proof stage before Mr. Brownlee had an 
opportunity to read the reviews in question.

This manual is, to use Professor Tukey’s language, “a very good cookbook”
of statistical methodology, and could, with more careful attention devoted 
to the basic hypotheses underlying statistical methodology, evolve into a 
manual of considerable stature in the field. 

The major change in this edition consists of a reshuffling and expansion of 
the analysis of variance discussion, with greater attention devoted to back- 
ground material and consideration of examples of multi-factor experiments 
with various orders of replication. Among the miscellaneous additions are a 
short section on the Poisson distribution, more introductory material on 
quality control, and inclusion of the range-standard deviation conversion, 
the Fisher inversion procedure in multiple regression, and properly discour- 
aging remarks with respect to extrapolation outside the range of inde- 
pendent variables in regression work. 

A few important corrections have been made, including a note on the two- 
tailed F test and a paragraph designed to remove the false impression that 
multiple correlation technique does not permit the study of interactions. In 
discussing an example of the significance of a mean, the author points out 
that sigma is an estimate; hence, the determination made on few degrees of 
freedom is less accurate than the determination made on a large number of 
degrees of freedom; he points out, moreover, that the “t” distribution takes 
this into account. This point, not made in the first edition, may remove some 
of the confusion resulting from the apparent identification of population 
variance and its sample estimate. 

An added remark on page 26 seems to imply that, in testing homogeneity 
of variances, if the extreme estimates did not differ significantly by F, then 
Bartlett’s test would not show significance. It seems to this reviewer that the 
remark in question should have been accompanied by a certain amount of 
caution. 

The recommended procedure, given on pages 21 to 23, for comparing the 
means of two samples, depends on the sum of the sample sizes. It is not clear 
whether the author realizes that the following procedures are really different, 
but he suggests, without clarification, that if $n_1 + n_2 < 30$ the variances
be treated as homogeneous; that the Behrens-Fisher test be used when
$n_1 + n_2 > 30$. The important point is that from the very outset no mention is
ever made of the importance of background hypotheses and parallel informa- 
tion. 
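That the two procedures are really different can be seen from the test statistics themselves. The Python sketch below computes the pooled-variance t statistic and the separate-variance statistic on which the Behrens-Fisher test is based; these are the usual textbook forms, assumed here rather than taken from the pamphlet, and the reference distribution needed for the Behrens-Fisher test is not reproduced.

```python
from statistics import mean, variance


def pooled_t(a: list[float], b: list[float]) -> float:
    """Two-sample t statistic treating the variances as homogeneous."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / (pooled_var * (1 / n1 + 1 / n2)) ** 0.5


def separate_variance_statistic(a: list[float], b: list[float]) -> float:
    """Difference of means divided by a standard error that keeps the
    two sample variances separate."""
    return (mean(a) - mean(b)) / (variance(a) / len(a) + variance(b) / len(b)) ** 0.5


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0, 4.0]
    b = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
    print(pooled_t(a, b), separate_variance_statistic(a, b))
```

With equal sample sizes the two statistics coincide numerically, although they are referred to different distributions; with unequal sizes and unequal variances, as in the example, the statistics themselves differ. The choice between them therefore turns on what is believed about the variances, not on the combined sample size.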

A multiple regression example presented in chapter 10 has three independent
variables. The regression on one of them alone is significant at the one percent
level, but the effect of that variable is not significant when the other two are
included. In discussing this situation on page 63, Mr. Brownlee says, in referring
to the regression on that variable alone, that it “would appear significant at the
1% level, an entirely erroneous conclusion.” Why call this erroneous? The author is
certainly aware that the regression on either or both of the other variables might
turn out not significant when adjoined to it. In any case, the very important question of the order of
elimination of independent variables, or the corresponding question in
analysis of variance problems, seems nowhere to be mentioned. This is not 
a question of mathematics, it is an operational question whose solution is 
dictated by consideration of the problem at hand. Similarly, when two inde- 
pendent variables are correlated, the author says that the individual regres- 
sion coefficients are fictitiously large. This would seem to imply that it would 
be incorrect to use an individual regression coefficient if either one of the 
independent variables is used alone. Again, the reviewer is certain that what 
was meant was that if two independent variables are to be used jointly, the 
regression analysis should be the joint multiple analysis. 

On page 107, it is implied that the analysis of variance is impossible when 
the data are haphazard or when units are missing. This is not strictly true, 
although the analysis in such cases may be extremely difficult because of the 
large number of linear simultaneous equations which may have to be solved. 
On page 108, it is pointed out that the computation for a multiple correlation, 
with say 5 independent variables, is much more severe than for a five factor 
analysis of variance. It might be noted that the analysis of the conventional 
design has been simplified by the introduction of various orthogonalities; 
similar orthogonalities in a multiple correlation analysis would simplify the 
analysis correspondingly. When hypotheses are properly formulated in the 
analysis of variance situation and in the multiple correlation situation, the 
computations coincide as well as the significance tests. The major difference 
between the two situations is primarily with respect to the way parametric 
hypotheses are stated. 

On page 31, it is remarked, in connection with 2 × 2 contingency tables,
that in borderline cases the correction for continuity is important. Since the 
chi-square distribution is still an approximation, in this case, with or without 
the correction for continuity, and since the statistician rarely has adequate 
quantitative basis to set precise significance levels, it would appear to this re- 
viewer that no really important decision should rest on the correction for 
continuity. 
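The size of the correction in a borderline case is easy to exhibit. The Python sketch below computes chi-square for a 2 × 2 table with and without the correction for continuity, using the usual shortcut formula; the function name and the sample table are hypothetical.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int, continuity: bool = False) -> float:
    """Chi-square (1 degree of freedom) for the table [[a, b], [c, d]].

    With continuity=True the absolute cross-product difference is reduced
    by half the total count before squaring.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if continuity:
        diff = max(diff - n / 2, 0.0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


if __name__ == "__main__":
    table = (13, 7, 6, 14)  # hypothetical counts
    print(chi_square_2x2(*table))
    print(chi_square_2x2(*table, continuity=True))
```

For this table the uncorrected value is about 4.91 and the corrected value about 3.61, which fall on opposite sides of the 5 per cent point of 3.84. This is the situation in which, on the reviewer's argument, neither figure should be allowed to settle an important decision.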

This reviewer feels that the value of this pamphlet could still be increased 
by further revision to include expanded background exposition of the 
hypotheses behind the statistical tests, discussion of “practical” significance 
and perhaps points such as those raised above, where further clarification 
might be useful as well as informative. Nevertheless, the pamphlet in its 
present form is potentially useful to a large class of people and contains a 
considerable amount of material for the rather low price of sixty cents. 


Review by Alan E. Treloar
Associate Professor of Biostatistics, University of Minnesota 


This interesting manual of statistical methods, written primarily as a
guide for industrial research workers who are not acquainted with the 
procedures or philosophy of statistics, has been considerably extended in the 
present revision. It is now referred to in the preface as a “monograph” instead
of a “report,” and the author’s name is appropriately displayed on the
title page instead of being given merely in an acknowledgment line of the
foreword as was previously the case. 

The textual material of the earlier edition has been amended somewhat 
through incorporation of some additional explanatory sentences, alterna- 
tive procedures and extra examples, all of which are valuable. However, the 
principal changes involve (1) reorganization and extension of the discussion 
of analysis of variance in relation to factorial experimentation, (2) illustra- 
tion of logarithmic transformation to reduce anormality and permit rou- 
tine handling of proportional change situations, and (3) inclusion of a short 
chapter on the Poisson distribution. The 27 extra pages of text, together 
with removal of line spacers at many points (without much loss in reading 
ease), have increased the volume of discussion by perhaps 40 per cent. 

The author had not seen the reviews of the first edition given in this 
JOURNAL when the revision went to press. In a communication to the Re- 
view Editor he refers to this as “unfortunate.” The present commentator 
agrees that this is so; for had adjustments been made to eliminate the de- 
fects discussed in those excellent reviews, the present edition would have 
had considerably enhanced value. 

The error of using the F test when |F| is called for has been corrected in
this revision, although the approach is somewhat clumsy because reference
to the correct answer is given as an afterthought to use of the inappropriate
one. It is also stated without reservation that the probability corresponding
to any given |F| is double that for F, but this is true only when the samples
are of the same frequency. The other defects discussed by Professors Tukey
and Wolfowitz remain. It may be added that correction for continuity is
considered of great importance in 2 × 2 tables but the subject is not mentioned
in connection with a 3 × 1 table where each of the expected frequencies
is only 5. Also a reversed code scale in the example of simple correlation
may cause trouble for the inexperienced reader, for the sign of the
coefficient is reversed without explanation.

The reiterated conclusion that an insignificant F means that an initially 
assumed component of the numerator variance does not exist is most dis- 
turbing to this reviewer. Perhaps the author is uneasy about this point when 
he concedes that values of F somewhat below the accepted critical level for 
a claim of significance suggest situations worthy of further investigation. 
However, he follows the beaten path in usage of the 5 per cent level rather 
assiduously in the actual analyses. 

On page 13 one finds the following statement, which is not uncommon in 
statistical texts: “For many purposes the 5 per cent level is accepted, but we 
must realize that this means that 1 in every 20 times we will assert that 
an effect exists when it really does not.” The assumption implicit here that 
all responses are in fact expressions of chance variation is a strange one, and 
quite out of line with much of the argument in this book. The maximum 
proportion of errors “of the first kind” corresponds to the significance level 
adhered to; usually, the proportion may be expected to be very much less. 
While the author seems aware of the dangers of assuming that actual data
conform in structure to the mathematical models in terms of which the
statistical procedures have been elaborated, he does not discuss the assump- 
tions intrinsic to the models at all adequately. Indeed at one point (p. 108) 
he goes so far as to say: “we can disregard the underlying assumptions [of 
the analysis of variance] with comparative impunity.” Many will challenge 
this point. 
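The point made above about the 5 per cent level, that it is a maximum for the proportion of false assertions and is reached only if no real effects are ever present, can be checked by simulation. The Python sketch below is an illustration under assumed conditions: each experiment yields a single standard normal test statistic, real effects shift its mean by a fixed amount, and the fraction of experiments with no real effect is set by the user. None of these settings comes from the book under review.

```python
import random
from statistics import NormalDist


def false_assertion_rate(frac_null: float, effect: float = 3.0, alpha: float = 0.05,
                         trials: int = 20000, seed: int = 1) -> float:
    """Proportion of all experiments in which an effect is asserted
    although none exists (errors of the first kind)."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    false_assertions = 0
    for _ in range(trials):
        no_effect = rng.random() < frac_null
        true_mean = 0.0 if no_effect else effect
        z = rng.gauss(true_mean, 1.0)
        if no_effect and abs(z) > critical:
            false_assertions += 1
    return false_assertions / trials


if __name__ == "__main__":
    for frac_null in (1.0, 0.5, 0.2):
        print(frac_null, false_assertion_rate(frac_null))
```

Only when every experiment is pure chance variation does the rate approach 1 in 20; when real effects are common, the proportion of false assertions among all tests is correspondingly smaller.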

This book will prove useful to teachers of statistics, despite its failure to 
maintain desirable technical standards in the respects mentioned. It broad- 
ens the horizon with applications in a field where before there was but little 
illustrative material. Although the theme is pursued with missionary zeal, 
unusually good balance with practical considerations is preserved. Provided 
that the reader is aware of the deficiencies discussed in these reviews, the 
presentation should prove very helpful as accessory reading in connection 
with statistical courses. 

Measuring Business Cycles. Arthur F. Burns (Director of Research; Professor 
of Economics) and Wesley C. Mitchell (Director of Research, 1920-1946; Pro- 
fessor Emeritus of Economics). (National Bureau of Economic Research; Colum- 
bia University). Studies in Business Cycles, No. 2. New York 23: National 
Bureau of Economic Research, Inc. (1819 Broadway), 1946. Pp. xxvii, 560. 
$5.00. 

Review by Elmer C. Bratt
Professor of Economics, Lehigh University 


The publication of Number 2 of the National Bureau’s Studies in Business
Cycles is a notable event. This volume is faithful to the general pattern of
study outlined in 1927 when Number 1 was published but has appeared later 
than then anticipated. In the meantime the program as a whole has taken 
shape. Number 2 highlights the methods employed and evaluates their effectiveness.
It is to be followed by a series of monographs applying the
techniques here established to separate economic processes. A final volume
is projected to weave the results together into a theoretical account on how 
business cycles run their course, although a summary preview is now 
promised for the near future. 

The methods of analyzing the business cycle are developed to test the 
business-cycle hypothesis: recurrent expansions and contractions occurring 
at about the same time in many economic activities. The first step is to adjust 
seasonally the original series. Turning points are then located by judicious 
employment of a set of mechanical rules, establishing separate cycles in each 
series. Each of these cycles is then represented in percentage variations from 
its average value. “Specific Cycles” are thus obtained, correction having 
been made for seasonal and inter-cycle trend. 
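The step of expressing a cycle in relation to its own average can be sketched in a few lines of Python. The sketch assumes that the values are already seasonally adjusted and run from one trough to the next, and it uses a plain arithmetic mean as the cycle base; the function name and data are hypothetical, and the National Bureau's actual rules contain refinements not shown here.

```python
def cycle_relatives(values: list[float]) -> list[float]:
    """Express each value in one cycle as a percentage of the cycle's average."""
    base = sum(values) / len(values)
    return [100.0 * v / base for v in values]


if __name__ == "__main__":
    one_cycle = [80.0, 100.0, 120.0, 100.0]  # hypothetical series over one cycle
    print(cycle_relatives(one_cycle))
```

The percentage variation from the average is the relative minus 100. Because each cycle is divided by its own average, differences in level between successive cycles, the inter-cycle trend, drop out, while any trend within the cycle remains.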

The specific cycles contrast with the “Reference Cycles,” computed similarly
for each series. The reference cycles cover the periods of general business
cycles. Turning points in general business cycles represent the consensus
of a collection of time series rather than points in a single aggregate index. 
Both the specific and reference cycles, where the data are monthly, are 
divided into nine stages, including the initial and terminal trough, the peak,
and three approximately equal periods each for expansion and contraction.
The levels at the troughs and peak ordinarily are represented by a three-month
average. A series of tables for each series gives a wide range of information,
including the levels and duration of various stages, rates of change between
stages, and inter-cycle changes. One set of tables is made for specific
cycles and another for reference cycles. An additional set of tables is de- 
veloped to show the conformity of each series to general business cycles. In 
each work table, values are given for the separate cycles and for the arith- 
metic mean and average deviation. By mid-1942 over a thousand monthly 
and quarterly series had been so analyzed, principally drawn from the 
United States, but over 200 from the three principal European countries. In 
addition, over 200 yearly series have been analyzed, a third of them from 
Europe where monthly data are less available. The yearly data are found to be 
a poor substitute, however, because major cycle changes occur within a year. 
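The nine-stage division described above can be sketched for monthly data as follows. This is a simplified Python illustration under stated assumptions: turning points are supplied as positions in the series, the cycle base is a plain mean from trough to trough, and the months between turning points are cut into thirds by simple rounding. The function names are hypothetical, and the volume's own rules for allotting months to stages are more detailed.

```python
from statistics import mean


def split_thirds(items: list[float]) -> list[list[float]]:
    """Divide a run of months into three approximately equal periods."""
    n = len(items)
    if n < 3:
        raise ValueError("need at least three months to form three periods")
    cuts = [round(i * n / 3) for i in range(4)]
    return [items[cuts[i]:cuts[i + 1]] for i in range(3)]


def nine_stage_pattern(values: list[float], trough0: int, peak: int, trough1: int) -> list[float]:
    """Stage averages I to IX as percentages of the cycle average."""
    base = mean(values[trough0:trough1 + 1])

    def three_month_average(i: int) -> float:
        # Month i with its neighbours, clipped at the ends of the series.
        return mean(values[max(i - 1, 0):min(i + 2, len(values))])

    expansion = split_thirds(values[trough0 + 1:peak])
    contraction = split_thirds(values[peak + 1:trough1])
    stages = ([three_month_average(trough0)]
              + [mean(period) for period in expansion]
              + [three_month_average(peak)]
              + [mean(period) for period in contraction]
              + [three_month_average(trough1)])
    return [100.0 * s / base for s in stages]


if __name__ == "__main__":
    series = [110, 100, 110, 120, 130, 140, 130, 120, 110, 100, 110]  # hypothetical
    print(nine_stage_pattern(series, trough0=1, peak=5, trough1=9))
```

The same function serves for a reference cycle if the turning points passed in are those of general business rather than those of the series itself, which is how the two sets of tables differ.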

But little of this work-table information could be presented in the present 
volume. After a full description of the methods, attention is turned
to questions of their effectiveness. First, consideration is given to 
the retention of the intra-cycle trend. Conviction here resulted from the fact 
that the trend line determined depends upon subjective factors together with 
a belief that the trend must be retained to represent the “cycle of experience” 
as the unit of analysis. Trend adjustment in six series is found to reduce
cyclical variability and to make the cycles in different processes more comparable.
Supplementary trend-adjusted data are not employed for all series simply 
because of the work and expense. 

The need for smoothing out random influences initially is checked by con- 
trasting results obtained from smoothing four actual and six artificial series, 
principally by Macaulay’s “43 term summation approximately fifth-degree 
parabolic graduation.” It is concluded that initial smoothing would eliminate 
part of the actual cyclical movement but not all of the random influence. 

Major attention is given to the significance of the average cycles computed. 
Since cycles are thought of as unique experiences by the National Bureau, 
the employment of averages presents difficult problems. Should cycles be 
grouped secularly or according to some scheme of long cycles before averag- 
ing? Little relationship is found between order in time and duration and 
amplitude of 7 test series; secular changes in durations and in amplitudes do 
not appear to be significant. Checks on the variation in character of business 
cycles presupposed by the most acceptable long-cycle hypotheses reveal no 
significant grouping. 

Measurements presented in the volume move in almost all directions. No 
measurement is considered final; all are viewed as steps in a series of suc- 
cessive approximations. The American monthly reference dates through 
1927, for instance, have not been revised since 1929, although some are 
definitely known to require revision. The conclusions cannot be applied as 
generally as many of us would like; they apply rather narrowly to the 
significance of the National Bureau methods. 
Do the benchmarks set out provide appropriate background for the forthcoming
analyses of various economic processes? The arithmetic mean may be called
into question. Although the authors are convinced that the median and
positional means are less satisfactory, the possibility of the geometric mean
is not considered. The prevailing positive skewness of cycle-length
distributions appears to the reviewer to involve more than the artificiality
introduced by setting a minimum but no effective maximum length for the cycle.

The rejection of positional means in averaging cycles and acceptance of
them in measuring seasonals may indicate that greater importance is
attached to extreme values in the former case. Separate averages for major
and minor cycles certainly would provide better clusters, and the reviewer
believes greater homogeneity. Since the series employed do not represent
processes, the amplitudes studied are of doubtful significance. Recognizing
the inadequacies of all measurable amplitudes, the reviewer believes
nevertheless that there is a clear difference in amplitude of industrial activity since
1878 between commonly recognized major cycles and others. The authors’
data on length of expansions of deflated clearings show only three of seven
rising from major depressions lasting less than thirty months and only three
of eight rising from minor depressions lasting more than thirty months.
Further, the authors note “the degree of cyclical diffusion is correlated with the
amplitude of cyclical fluctuations.”

Most of the cycle measures would be modified by displacement of the
turning points employed. If the rules employed for determining these turning
points are accepted, the question should be raised whether the measures
developed overemphasize them. Although the turning points are dated at the
end of a flat top or bottom, there are many cases where the date is several
months before an expansion or contraction was recognized at the time.
Since most series show interruptions of expansions and contractions while
they are under way, why does the terminal high or low point have major
significance? Notably, the timing measures, which are developed from a
comparison of these turning points instead of from the unsatisfactory, but
widely used, correlation method, are critically dependent upon this dating
scheme.

Does the holding that the cycle of experience includes the intra-cycle
trend, but excludes seasonal, represent a Schumpeter position to the effect
that the trend and cycle are inseparable? Or does it result from statistical
convenience? Many analysts will question the authors’ position that the
business community fully allows for seasonal and not at all for growth and
decadence. It is one thing to recognize the difficulties of trend adjustment and
another to rationalize that the trend is a part of the realistic cyclical
experience. Does not the latter position prejudge the case to an extent that the
character of cycle to be ultimately determined is partly postulated at the
start?

The reviewer believes that per-month averages are overused. Which is a
worse depression: one in which the contraction shows a violent and rapid
decline and therefore a large per-month average drop or one which shows a
slower decline to the same depth?

The reviewer agrees that the frequent relaxation of rules, the making of 
separate judgments at successive stages of inelegant analyses, does not in 
this case represent a method inferior to more objective formulations. The de- 
gree of elegance we can afford depends upon the kind of problem faced. Care- 
ful study of the volume will give the student added respect for the impor- 
tance of the business cycle as a distinct type of fluctuation. 

Twenty years ago Mitchell’s analysis was to be compared only with a mul- 
titude of competing theories, none of which had been adequately checked 
with measurable occurrence. The National Bureau methods are less pre- 
eminent today. Econometric analysis has proceeded a long way and has a 
substantial following. Concomitant-cycle analysis has a wide group of ad- 
herents. And the present renown of the Keynesian aggregative consumption- 
income theory puts it out of reach of earlier theories. In the present volume 
the National Bureau faces only the second of these contending frameworks, 
and in this case from the viewpoint of whether the bench mark averages are 
distorted by ignoring long cycles. Respect for the work of econometrists em- 
ploying annual data is weakened, however, by the demonstrated inadequacy 
of annual data in studying business cycles. Although the volume does not 
materially reduce the area of disagreement, it represents only the initial 
evidence. We cannot expect the National Bureau to come to grips with 
competing frameworks until the processes are examined. 


How to Read Statistics. R. L. C. Butsch (Formerly Professor of Education, 
Marquette University). Milwaukee 1, Wis.: Bruce Publishing Co. (524-544 N. 
Milwaukee St.), 1946. Pp. v, 184. $2.50. Two reviews follow: 
Review by Louis Guttman
Associate Professor of Sociology, Cornell University 


An easy path to climb towards a serviceable knowledge of statistical
theory is perennially being sought. The book now being proffered by
Dr. Butsch is another attempt to convey statistical understanding to per- 
sons without any mathematical background. He addresses himself to the 
“large number of teachers, social workers, personnel directors, and industrial 
executives, who through lack of opportunity or inclination have failed to 
acquire the minimum of statistical techniques,” that is, who have not ever 
been trained to apply a formula even in a purely mechanical fashion. The 
emphasis is to be on the “why” and not on the “how”; not a single formula 
appears in the book (albeit a few standard algebraic symbols are used). 
The subject matter covered ranges from graphic presentation and aver- 
ages through partial and multiple correlation, sampling errors, and analysis 
of variance and covariance. The examples used are from educational 
psychology. 
The presentation is simple and lucid, on the whole. Indeed, it is one of the 
best-written statistical books this reviewer has seen. An unresolved problem 
in this reviewer’s mind is: How long will the material remain with the reader? 
Can a person sufficiently retain and appreciate the ideas that are laid out 
before him if he is given no practice in computing, not to speak of deriving 
the formulas that are the consequences of the ideas? This is a problem that 
well might be investigated experimentally by educational psychologists. In 
any event, Dr. Butsch’s book can certainly provide a useful supplementary 
discussion to a more complete treatment, no matter how sufficient it is by 
itself for its intended audience. 

A number of faulty statements that should be pointed out include the 
following: The author states that statistical treatment is required by sci- 
ences dealing with living things, but not by the physical sciences (p. 2); 
that dispersions of distributions cannot be compared unless they have the 
same average (p. 28); that standard scores are more “strictly mathematical” 
than percentile ranks (p. 73); that a correlation ratio cannot be computed 
for a quantitative variable from a qualitative variable (pp. 102-104); that 
probability statements about sample statistics can also be properly inter- 
preted as probability statements about population parameters (p. 143); that 
probable errors should be used for standard deviations and correlation co- 
efficients (pp. 145-146); that the t-test should be used to test whether or not a 
sample could have been drawn at random from a known population (where 
the population variance can be known as well as the population mean) (p. 
157); and that in most instances the number of degrees of freedom for t is
just one less than the size of sample (p. 159). The biserial and tetrachoric cor- 
relation coefficients are introduced as new measures of correlation rather 
than as equivalent to the product-moment coefficient; their underlying as- 
sumptions are not stated; and indeed the examples used do not fulfill the 
assumptions (pp. 108-110). The connection between partial correlation and 
prediction is misconstrued by introducing a vague notion of a “real” cor- 
relation instead of using the definition of correlation used earlier (pp. 127- 
128). In general, the assumptions behind the various significance tests are 
not given, especially those of analysis of variance and covariance, although 
—in contrast to other psychologists—the author introduces the normal dis- 
tribution where it belongs, namely in connection with sampling problems. 

Deficiencies noted in the discussion of graphic presentation include the
following: Grid lines are drawn through the bars of a histogram (p. 9); in 
another there is a mixing of light and heavy lines for bar outlines (p. 16). 
The impression is given that the median is graphically represented by a line 
segment, whereas the mean is shown as a point (p. 16). Pie charts are en- 
dorsed liberally for complicated comparisons (pp. 43-44). Vertical “fre- 
quency” scales are retained for a frequency polygon and a smoothed curve, 
instead of using an area key (pp. 63-64). 


Review by Helen M. Walker
Professor of Education 
Teachers College, Columbia University 
The attempt to explain the meaning of commonly encountered statistical
terms without using any formulas whatever and using almost no computations
is a task requiring considerable courage as well as great skill in exposition.
This book has been more nearly successful than the reviewer anticipated
it could be, knowing only too well the difficulties to be encountered.
The language is clear and simple; the illustrations are well chosen and 
within the understanding of the sort of person likely to read the book; 
almost all of the terms ordinarily used in non-theoretical studies are in- 
cluded although the coverage is more complete for education than for 
biology, business, or economics; no previous preparation is presumed except 
an alert and inquiring mind. 

The first major problem to be faced by the writer of such material is the 
choice of topics. Shall he treat the topics the reader is likely to come upon in 
published research or only those topics which he considers appropriate to 
use? If he omits certain outmoded statistics, the person who reads his book 
in the hope of finding enlightenment on the meaning of published research is 
doomed to disappointment. Dr. Butsch apparently decided to include in the 
first part of his book (and to treat with respect) statistics which are far less 
efficient and more ambiguous than other more modern statistics which he 
treats in later chapters. The reviewer would question this decision. 

The topics referred to in this text include: frequency distribution (6 pp.), 
central tendency (8 pp.), graphic representation (38 pp.), all of which can be 
presented easily on a non-technical level and have been so treated in a num- 
ber of introductory texts; measures of dispersion (10 pp.), including quartile 
deviation, mean deviation, standard deviation, with two computations of 
each from easy data; the transformation of raw scores into percentile scores 
and standard scores (12 pp.); simple correlation and regression (32 pp.), 
including brief mention of the correlation ratio, bi-serial correlation, the co- 
efficient of contingency, and tetrachoric r; multiple regression and multiple 
and partial correlation (21 pp.); a chapter called “Measures of Reliability” 
(15 pp.), which brings in such terms as sample and population, statistic 
and parameter, random sample, sampling error, sampling distribution, 
standard error, “determining the limits within which a parameter will prob- 
ably be found,” probable error; a chapter on “Measures of Significance” (17 
pp.), which brings in such terms as the probable error of a difference between 
means, critical ratio (i.e. a difference divided by its probable error), the t-test, 
degrees of freedom (on which a fair amount of light is shed in less than a
page), the null hypothesis, and levels of significance; and a second chapter on 
measures of significance, including the chi-square test (5 pp.), treated with- 
out reference to the previous mention of the coefficient of contingency and 
with an example using a 2 × 2 table of only 40 cases in which Yates’ correction
would have made considerable difference had it been applied, analysis of 
variance (7 pp.), and analysis of covariance (6 pp.). 
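
The remark about Yates’ correction can be made concrete with a small sketch. The counts below are hypothetical and are not the example printed in the book; they were chosen only because they also total 40 cases. The function name and its arguments are likewise assumptions for illustration.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Chi-square for a 2 x 2 table with cell counts
         a  b
         c  d
    With yates=True, Yates' continuity correction is applied by
    reducing |ad - bc| by N/2 (never below zero) before squaring.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2.0)
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * diff ** 2 / denominator

# Hypothetical table of 40 cases (not the book's data).
plain = chi_square_2x2(14, 6, 7, 13)
corrected = chi_square_2x2(14, 6, 7, 13, yates=True)
print(round(plain, 3), round(corrected, 3))
```

For these illustrative counts the uncorrected value lies above the familiar 5 percent point for one degree of freedom (3.841) and the corrected value lies below it, which is the kind of difference the reviewer has in mind for small tables.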

After the topics to be treated have been determined upon, it is indeed a 
difficult task to make them take on meaning without either computational 
exercises or mathematical derivations. In fact, it is practically impossible to 
be intelligible on an elementary level and also completely correct if one is 
dealing with topics beyond the simplest ones. If the writer of such a text is 
primarily concerned with making statements which cannot be misconstrued 
and which cannot be called in question by the person who knows his theory, 
he will hedge his explanations with so many conditions and qualifications 
that the reader is left in a haze. Therefore it seems pedantic to point out in- 
correct statements and inconsistencies in a book which on the whole creates 
a very favorable impression, but nevertheless it seems necessary to mention 
a few such. 

There are about 16 pages devoted to the probable error written in the 
manner of 20 years ago, even including such a statement as the following: 
“Thus, if a correlation coefficient for a particular sample is reported as 
.60 ± .02 we know that the chances are even that the correlation for the total
population will not be less than .58 or more than .62; and the chances are 
very remote that it might be as small as .52 or as large as .68.” After reading 
this, it is a real surprise to come upon a rather good elementary treatment of 
t. The spirit and even the phraseology of these two sections are so dissimilar 
that it is hard to understand how the author who wrote one of them could 
also have written the other. The relation between the critical ratio and the 
t-test is left in some confusion. The reviewer feels that the extended treat- 
ment of probable error with all the outmoded phraseology of interpretation 
is unfortunate and that it would have been better to make the whole explana- 
tion in terms of standard error with a sentence or two explaining the relation 
between standard and probable error and a statement that the latter could 
have meaning only for a statistic that was normally distributed. The implica- 
tion of this section and the suggestion on page 147 that the probable error 
can be applied to a difference between percentages, standard deviations, or 
coefficients of correlation seems to suggest that all sampling distributions are 
normal although that incorrect statement is never made. The term, “sigma,” 
is used throughout instead of the more straightforward “standard deviation.” 
Why authors do this, substituting the name of a Greek letter (which is no 
longer widely used as the symbol for the standard deviation of a sample) 
for a perfectly good noun, the reviewer never can understand. The treat- 
ment of partial correlation is particularly happy though it might have been 
made still better by the use of the concept of residual error. The illustration 
of the t-test on page 157 seems to suffer from errors in printing and will be 
most confusing; it refers to ten scores but prints only nine. However, the 
mean as stated is not the mean of this nine. Even by adding a tenth score, 
which would produce the mean quoted, it is impossible to secure the value 
of t quoted, and the reviewer cannot decide what computations were actually 
made. Certain topics such as the correlation ratio, bi-serial r, and tetrachoric 
correlation are treated so briefly that one wonders if their introduction will 
not be confusing. The book might have been made more useful by the in- 
clusion of well selected annotated references to more extended treatments 
available elsewhere. 
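
The relation between standard and probable error that the reviewer would have liked stated can be sketched as follows. The sketch assumes a normally distributed statistic, which is the only case in which the probable error has this meaning; the function names are illustrative. Note that the resulting probabilities describe the sampling distribution of the statistic and are not probability statements about the population value.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

# One probable error is the deviation that a normally distributed
# statistic exceeds (in absolute value) half the time: the upper
# quartile of the standard normal curve, about 0.6745.
PE_PER_SE = STD_NORMAL.inv_cdf(0.75)

def probable_error(standard_error):
    """Convert a standard error to a probable error (normal case only)."""
    return PE_PER_SE * standard_error

def standard_error_from_pe(pe):
    """Convert a probable error back to a standard error (normal case only)."""
    return pe / PE_PER_SE

def coverage_within(k_probable_errors):
    """Share of a normal sampling distribution lying within
    k probable errors of its mean."""
    z = k_probable_errors * PE_PER_SE
    return 2.0 * STD_NORMAL.cdf(z) - 1.0

# A probable error of .02, as in the quoted .60 ± .02, corresponds to a
# standard error of roughly .03 if the statistic were normally distributed.
print(round(standard_error_from_pe(0.02), 4))
print(round(coverage_within(1), 3), round(coverage_within(4), 3))
```

Under the normality assumption, one probable error on either side covers half of the sampling distribution and four probable errors cover roughly 99 percent of it, which is where the "even chances" and "very remote" phrasing of the older treatment comes from.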

The person making his first acquaintance with statistical method will gain 
much by a cursory reading of this book. It can be recommended to one who 
wants only to know the general spirit of statistical investigation, and that 
is the person for whom the book was written. If subsequently this person 
undertakes to use the intentionally brief discussions of this book as a guide 
for writing an interpretation of data, if he tries to press the meaning of in- 
dividual sentences too far, he will be trying to make the book serve a purpose 
which it cannot fulfill and which it was never intended to fulfill. The writer 
has designed the book for “a large number of teachers, social workers, person- 
nel directors, and industrial executives, who through lack of opportunity 
or inclination have failed to acquire a minimum of statistical techniques”; 
many of whom, however, “should be able to read the articles employing 
statistical procedures and should at least be able to understand the purpose 
and the results of the techniques employed.” For such persons this book 
should be useful. 


Rudimentary Mathematics for Economists and Statisticians. W. L. Crum (Pro- 
fessor of Economics) and Joseph A. Schumpeter (Professor of Economics). (Har- 
vard University.) A revision of Supplement to Quarterly Journal of Economics, 
Vol. 52, May 1938. New York 18: McGraw-Hill Book Co., Inc. (230 West 42nd 
St.), 1946. Pp. xi, 183. $2.50. Two reviews follow: 


Review by J. E. Morton
Cornell University 


Perhaps the most outstanding feature of the book is the pedagogical skill
with which the student is introduced to and led through the field of
calculus. The ease of exposition, and the smooth and well-integrated de- 
velopment of the idea of rates of change and their systematic treatment, 
result in a volume which places all teachers considerably in the debt of the 
authors. From this point of view the text compares favorably with such 
simple and successful introductions into the pattern of thought of the 
calculus as DeMorgan’s and Irving Fisher’s. 

The present volume covers very much the same ground as the well-known 
original, which was published as a supplement to the Quarterly Journal of 
Economics in March 1938. The new material includes paragraphs on Tay- 
lor’s series, on homogeneous production functions, on Lagrange multipliers, 
on the concept of the integral, and an enlargement of the treatment of 
higher order differential equations. A brief discussion of determinants is 
added in a new chapter. The book should make easy and pleasant reading 
for any student who still remembers the essence of his high school algebra 
course. 

The less satisfactory aspects of the book are the absence of bibliographical 
material and the absence of problems and examples, which some instructors 
or readers may deplore. The book addresses itself primarily to the future 
economist rather than to the statistician. 

There is no doubt in the mind of the present reviewer that the book will 
be sincerely welcomed not only by teachers but also by those students in 
economics who do not feel quite up to the task of working their way through, 
say, Allen’s Mathematical Analysis for Economists. Projected against the 
background of present-day textbook literature, Rudimentary Mathematics 
for Economists and Statisticians represents what may be considered the prel- 
ude to the inclusion of mathematics as part of the economics curriculum. 
Professors Crum and Schumpeter in their foreword seem to suggest such a 
claim and the book provides ample justification for it. 

In trying to evaluate this book, two questions come to mind: (a) Is there 
need for a book of the nature of Crum and Schumpeter’s Rudimentary 
Mathematics for Economists and Statisticians? (b) Since it is written in the 
fashion of a textbook, or at any rate, in such a way that it would lend itself 
admirably to use as a text, should the existence of such a book lead to the 
establishment of a course of a similar nature, and is such an addition to the 
conventional economics curriculum desirable? 

If the book is used as collateral reading or for purposes of self-study, the 
answer to the first question is an unreserved “Yes.” 

As to the second of the two questions, it will not be easy to reach a satis- 
factory answer. There can be little doubt that mathematics and mathemat- 
ical models of thought are being used to an increasing extent in economic 
literature. This emphasis on economic analysis by quantitative methods is 
no longer limited to the loftier areas of the graduate curriculum, but it has 
been introduced into the rather pedestrian regions of the introductory eco- 
nomics course. At the same time, the training of economics students in the 
understanding and the use of mathematical language is sadly lacking. The 
question of how to remedy this situation is certainly one of the most im- 
portant of those confronting the planners of economic curricula. The prob- 
lem is a pedagogical and an administrative one, permitting many widely 
different answers. Should the student in economics undergo a more thor- 
ough training in mathematics, say, through the calculus as offered by the 
existing mathematics departments? Is such an intensive training, as an in- 
tegral part of a student’s education, feasible for, say, the student in science 
but not for the young economist because of “lack of time” or any other 
pretext or reason? 

This latter belief, although it is rather popular, requires careful thought. 
As in the case of the teaching of statistics to economists, the point of view 
prevailing at the time of introduction of a new course into the curriculum 
may not prove to be altogether happy and desirable in the long run. 


Review by Gerhard Tintner
Professor of Economics and Mathematics, Iowa State College 


This book provides an elementary introduction to some methods of
mathematical analysis for economists. The first and second chapters deal with
graphic analysis and related subjects in analytical geometry and simple 
algebra. They are introduced with the help of a discussion of some problems 
in cost theory. Chapter 3 treats the concept of the limit. The examples are 
taken from the theory of marginal cost, of demand curves, of marginal 
utility and of marginal revenue. Chapter 4 discusses rates and derivatives. 
This includes various rules for differentiation, which are, however, intro- 
duced without proof. Higher derivatives, Taylor series, partial and total 
derivatives are also included. 

Chapter 5 takes up maxima and minima in one and in several variables
as well as a brief treatment of inflection points. Examples are given from 
cost and production theory as well as from the statistical theory of linear 
regression. Chapter 6 deals with differential equations and their integration 
as well as with definite integrals; also discussed are elasticity of demand, 
compound interest, total utility and some simple problems in economic dy- 
namics. The last chapter gives an introduction to determinants. 

The authors ought to be congratulated for having achieved almost a 
miracle of condensation. A great number of difficult subjects are discussed 
in a very small space. This could only be achieved by presenting many 
mathematical theorems without proof. 

This lack of proof may, however, be confusing to the reader. It would 
have been possible to give, for instance, exact proof of the rule for the dif- 
ferentiation of a power if the binomial theorem had been introduced. The 
authors should have included this theorem also because of its great impor- 
tance for the student of statistics and probability. On the whole, the needs 
of the student of statistics are somewhat neglected; the only statistical sub- 
ject used as an example comes from the theory of simple regression. Other 
proofs for rules of differentiation could also have been given without too 
much trouble and without adding excessively to the bulk of the book. 
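
The proof the reviewer has in mind might run as follows for a positive integer exponent n. This is a sketch of the standard argument, not a passage from the book, and it takes the binomial theorem as given.

```latex
\begin{align*}
(x+h)^n &= \sum_{k=0}^{n} \binom{n}{k} x^{n-k} h^{k}
         = x^n + n x^{n-1} h + \sum_{k=2}^{n} \binom{n}{k} x^{n-k} h^{k}, \\
\frac{(x+h)^n - x^n}{h} &= n x^{n-1} + \sum_{k=2}^{n} \binom{n}{k} x^{n-k} h^{k-1}, \\
\frac{d}{dx}\, x^n &= \lim_{h \to 0} \frac{(x+h)^n - x^n}{h} = n x^{n-1}.
\end{align*}
```

Every term of the remaining sum carries at least one factor of h, so the sum vanishes in the limit and only the term in n x^{n-1} survives.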

This book certainly fulfills a very useful purpose as an introduction to 
calculus for the somewhat mathematically inclined and interested economist. 
It is particularly recommended because it is free from the preoccupation 
with physics and especially classical mechanics which characterizes almost 
all introductory calculus books. A knowledge of these subjects is of no 
great importance to the economist. He will save himself time and unneces- 
sary trouble if he uses this introduction to calculus which utilizes exclusively 
examples from economic theory and mathematical statistics. 

If we compare the present book with other available introductions to 
mathematics for economists, it appears that it is somewhat inferior to 
R. G. D. Allen’s Mathematical Analysis for Economists which is generally 
considered the classical treatment of the subject. Allen’s book covers a 
much wider range of mathematical theory and also gives an almost complete 
survey of the traditional theory of mathematical economics. Another book, 
Elements of Mathematics for Students of Economics and Statistics, by D. C. 
Jones and G. W. Daniels (University Press of Liverpool, London, 1926) is 
also somewhat preferable. But both of these books are written for English 
students who are able to approach college economics with a better high 
school education in mathematics. Hence, it seems that they are not as 
suitable for teaching purposes as the present book. 
All the examples included are artificially constructed. One may wonder 
why the authors have not used some of the results of recent empirical eco- 
nometric research. These would provide excellent examples for an introduc- 
tion to mathematical and statistical methods. The authors could have used, 
for instance, some of the empirical demand curves established by Henry 
Schultz, the production functions of Paul Douglas and his school, and the 
cost functions established by Joel Dean—to mention only a few possibilities. 
These empirical results should take the place of the artificially constructed 
examples. They could be expected to appeal more to the students of eco- 
nomics and statistics and stimulate in them an inclination towards em- 
pirical research. 


On the Solution of Normal Equations and Related Topics. David B. Duncan 
(Pawlett Scholar, University of Sydney) and John F. Kenney (University of 
Wisconsin at Milwaukee). Milwaukee 3, Wis.: Book Store, University Extension 
Division (623 West State St.), 1946. Pp. iii, 35. $1.00. Paper. 


Review by Walter L. Deemer, Jr.
Lt. Colonel, Air Corps; Chief, Department of Statistics 
AAFP School of Aviation Medicine, Randolph Field, Texas 


The preface to this monograph states that it is “intended to serve as an
introduction to modern computational methods. . . . We explain the
matric theory that is useful in this connection and then describe some of the 
‘compact’ methods which have been developed recently, including the useful 
‘square root’ method.” 

Actually, it is an introduction to modern methods of computing the in- 
verse of a matrix and to computing the solution of a set of simultaneous equa- 
tions. It is not an introduction to modern computational methods in general. 

What the monograph does is to give in 35 pages a very clear and detailed 
explanation on an extremely elementary level of how to solve a set of simul- 
taneous equations and how to get the inverse of a matrix; it also gives the 
matrix theory, so that with a very small amount of study a beginner can 
learn to do the computation quickly and accurately and can also learn why 
each step is taken. 

For someone with no knowledge of matrix algebra, this monograph is an 
excellent introduction to the general concept of matrix theory and its 
application to the solution of simultaneous equations. A thorough study 
of this book will give even the beginner enough background to proceed to 
more advanced material on matrix algebra. 

The first two pages explain how simultaneous equations arise in least 
squares theory. This is followed by twelve pages on the more elementary 
properties of matrices and determinants. The general concept of factoring a 
matrix as a preliminary step to finding the solution of the equation or to 
finding an inverse is then developed. The next fifteen pages are devoted to 
an explanation of the square root method of solving equations and getting 
an inverse. This part of the monograph is particularly good. The exposition
is very complete, and enough detail is given, with numerical examples, so
that a beginner can follow it with little trouble. This is followed by two 
very useful pages on the method of moving decimal points to facilitate 
computation. The pamphlet ends with two pages on nonsymmetric 
equations. 
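
The square root method described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the monograph's own worked layout: it assumes a symmetric positive-definite coefficient matrix A, factors it as A = S'S with S upper triangular, and then solves by one forward and one backward substitution. The function names and the 3 x 3 system are hypothetical.

```python
import math

def square_root_factor(a):
    """Return upper-triangular s with a = s^T s (a symmetric positive definite)."""
    n = len(a)
    s = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Diagonal element: subtract the squares of the entries already found above it.
        d = a[i][i] - sum(s[k][i] ** 2 for k in range(i))
        if d <= 0:
            raise ValueError("matrix is not positive definite")
        s[i][i] = math.sqrt(d)
        # Remaining elements of row i.
        for j in range(i + 1, n):
            s[i][j] = (a[i][j] - sum(s[k][i] * s[k][j] for k in range(i))) / s[i][i]
    return s

def solve(a, g):
    """Solve a x = g via s^T k = g (forward pass), then s x = k (backward pass)."""
    n = len(a)
    s = square_root_factor(a)
    k = [0.0] * n
    for i in range(n):
        k[i] = (g[i] - sum(s[j][i] * k[j] for j in range(i))) / s[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (k[i] - sum(s[i][j] * x[j] for j in range(i + 1, n))) / s[i][i]
    return x

# Hypothetical symmetric system, chosen so that every unknown equals 1.
a = [[4.0, 2.0, 2.0],
     [2.0, 5.0, 3.0],
     [2.0, 3.0, 6.0]]
g = [8.0, 10.0, 11.0]
x = solve(a, g)
print(x)
```

The same factor S can be reused for several right-hand sides, which is how an inverse would be built up column by column.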

The authors develop no new theory and make no claim to originality. 
They have quite wisely refrained from trying to explain the wide variety of 
methods now available for computing an inverse. They have chosen a 
method which they believe to be one of the best, and have explained that 
method in detail. This reviewer believes that for routine calculation by 
clerks there are better methods than the square root method, but even if 
this is true, it is not a serious flaw in this monograph since after studying 
this book the beginner may go to the literature (there is a fairly complete 
bibliography given) and make his own evaluation of methods. 

It is hoped that a consolidation of the periodical literature on the theory 
and practice of solving simultaneous equations will soon appear. When it 
does, this book by Duncan and Kenney will be an ideal companion to it. 

Unfortunately, there are a number of misprints. Some of them will cause 
no difficulty even to the beginner. Others may cause the beginner some 
trouble, particularly those which occur in numerical examples where cross 
reference between the data and the operations of solution is made difficult 
by gross errors in transcription. A number of such errors occur on pages 18 
and 20. A list of misprints found by the reviewer follows: p. 6, line 4, for
column read row; p. 6, equation 3a, for G read G'; p. 13, exercise 1, for
superscript 5 on MacLane read 2; p. 14, equation 21, first equation, for gi: read
ti2; p. 18, equation 29, second equation, read (g2 - s12k1)/s22 for g2 - s12k1/s22;
p. 18, line 16 (s13 = ...), read .183 for .777; p. 18, line 19 (k = ...), read
2.93 for .777, add to end of expression =.777; p. 19, line 10 from bottom,
for (0.023) read (-0.023); p. 20, line 17, for A read G; p. 20, line 18, for A
read G; p. 20, line 6 from bottom, for .777 read 2.93; p. 20, line 5 from
bottom, for -.037 read +1.06; p. 25, line 8, for .132 read -.132; p. 25, line
6 from bottom, for (.354)^2 read (-.286)^2; p. 25, line 3 from bottom, for
(.039) read (.039)^2; p. 30, exercise 1, the matrix given for A^-1 should have the
decimal point of each element moved four places to the left; the values of
the u_i are correct as given; p. 32, line 12, for x' read x; for U' read U; p. 32,
line 14, for 10 read 107; p. 33, line 11 from bottom, for P read R.


Government Statistics for Business Use. Edited by Philip M. Hauser (Assistant 
Director, Bureau of the Census) and William R. Leonard (Deputy Chief, Divi- 
sion of Statistical Standards, Bureau of the Budget). (Washington, D. C.) 
New York 16: John Wiley & Sons, Inc. (440 Fourth Ave.), 1946. Pp. xvi, 432. 
$5.00. 
Review by William A. Spurr
Professor of Business Statistics, Stanford University 


“This book tells what information is available from the Federal Govern-
ment, the agencies from which it can be obtained, and ways in which
it can be applied to business and economic problems,” according to the
jacket. More precisely, it is a descriptive reference book for business men, 
somewhat incomplete and loosely organized, but nevertheless a valuable 
guide to the principal types of government data useful in management, 
production and marketing. It is written by twenty government statisticians 
(seventeen of them from the Bureaus of the Budget, the Census, and Foreign
and Domestic Commerce) who are leaders in their fields. The book is properly
divided by subject matter rather than by government agency. Chapter 
headings include general business indicators, manufacturing, mining, agri- 
culture, wholesale and retail trade, international trade, public utilities, 
corporate financial statements, money and banking, prices, population, 
housing, and labor. 

The treatment is primarily descriptive, with few tables or charts of the 
data themselves. A useful distinction is made in some chapters between 
“benchmark” statistics (chiefly census figures), current data, and special 
surveys. War agency statistics of ephemeral value are properly omitted. 
Units are carefully defined in many instances (for example, “wholesale 
prices,” page 308). Some of the writers also describe the gaps and defects 
in the available figures. 

A praiseworthy effort is made throughout to show the general uses of the 
data in business, chiefly in general management and marketing. Sometimes 
specific applications are also cited. These range from actual cases, such as 
that of a publisher who matched subscription lists against census schedules 
(p. 357), to more unrealistic suggestions, such as that the operator of a 
small grocery store should keep a ten-year monthly chart of five general 
price indexes (p. 318). 

While this book is of average length and contains much excellent ma- 
terial, it is still far from complete as a source book for business statistics. 
Little reference is made to important private sources of data (except for 
minerals) on the insufficient ground that “these sources are generally well 
known” (p. 16). Even within government sources, “it was necessary to leave 
out many statistical series of direct interest to business” (p. 16). Further- 
more, while some topics, such as national income estimates and the popula- 
tion census, are discussed in detail, many others receive such fragmentary 
treatment as interest rates (pp. 287-289) or “Foreign Commerce Weekly con- 
tains frequent articles on various aspects of foreign trade” (p. 199, 
quoted in full). The business economist or statistician may also feel that 
the book is “written down” for the small business man. For example, both 
foreword and introduction attempt to justify the use of statistics themselves, 
and chapter introductions point out the importance of such topics as manu- 
facturing, trade, and prices. 

A few writers understandably stress the virtues of their own bureau’s data 
(for example, national income as a business indicator) without enough atten- 
tion to the limitations of their estimates or the contributions of other 
agencies. Those who discuss the work of others are apt to be most objective 
and critical. The sections on manufacturing, mining, financial statements,
and banking are among the best in this respect. 

A final criticism reflects the difficulty of editing the work of many inde- 
pendent contributors. There is some unevenness of treatment, such as in 
the inclusion of illustrative tables, war agency data, subject vs. source clas- 
sification within chapters, case citations, and overlapping of subject matter. 
In some of the latter cases, such as retail sales (pp. 48, 163) and farm prices 
(pp. 114, 305), minor discrepancies indicate that the duplicating sections 
have not been cross-checked. The index, which is an essential aid in using 
a book of this kind, should be more detailed and cross-classified. The bibliog- 
raphy, too, might well be more complete, and classified by subject rather 
than source, as the book is. 

Despite these limitations, the book provides both the business man and 
the statistician with a valuable synthesis of essential government data. With 
the war stimulus to fact finding (tempered by the Federal Reports Act of 
1942), the current ambitious census program, and improvements in sampling 
techniques, the government’s statistical aids to business should continue to 
increase. It is hoped that this volume will be supplemented in the future by a 
still more comprehensive and critical survey of government and private 
contributions to business statistics. 


Changes in Income Distribution During the Great Depression. Horst Menders- 
hausen (Professor of Economics, Bennington College). Conference on Research 
in Income and Wealth, Studies in Income and Wealth, Vol. 7. New York 23: 
National Bureau of Economic Research, Inc. (1819 Broadway), 1946. Pp. xviii, 
168. $2.50. Two reviews follow: 

Review by Morris H. Hansen

Statistical Assistant to the Director 

Bureau of Census, Washington, D. C. 

This study of the changes in income distribution during the depression
is based primarily on an analysis of income data reported for both
1929 and 1933 for identical families. These data, which show the joint dis- 
tribution of 1929 and 1933 income for a sample of families in each of 33 
cities, were obtained as a part of the Financial Survey of Urban Housing taken 
during 1934. Although there is little information showing the incomes of 
identical families at different time periods, an effort is made to bring what is 
available to bear on the problem in order to verify the inferences drawn. The 
author obtains summary statistical descriptions of the data and formulates 
and tests hypotheses concerning the nature of income shifts during a period 
of economic fluctuations, with primary attention centered on a period of 
shift from prosperity to depression. The analysis is an intensive one of the 
available data. 

The rest of my remarks will deal primarily with the methods employed 
rather than the economic significance of the findings. 

It is difficult in reading the book to determine whether it is directed
primarily at one interested in the methodology, or at one interested in the
economic theory involved. For one concerned with the methodology, the book
should be of particular interest as it contains a full exposition of the tech- 
niques of analysis employed, and the reasons for choice of various methods. 
In addition, alternative methods of analysis are carried through to in- 
vestigate whether a different approach to a measurement being made or an 
hypothesis being tested would have led to different conclusions. For example, 
changes in income inequality are studied both through using the coefficient 
of variation as a measure of inequality and the coefficient of concentration 
(based on the mean of the differences between individuals, disregarding the 
direction of the differences). In chapter 2 a particularly interesting method 
of analyzing changes in the sections of the income distribution is introduced. 
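
The two measures of inequality named here can be compared on a small example. The sketch below rests on assumptions not taken from the book: the coefficient of concentration is taken to be the mean absolute difference over all ordered pairs of incomes divided by twice the mean (the usual Gini definition), the standard deviation is the population form, and the income figures are invented.

```python
import math

def coefficient_of_variation(xs):
    """Population standard deviation divided by the mean."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return math.sqrt(var) / mean

def coefficient_of_concentration(xs):
    """Mean absolute difference over all ordered pairs, divided by twice the mean."""
    n = len(xs)
    mean = sum(xs) / n
    mean_diff = sum(abs(a - b) for a in xs for b in xs) / (n * n)
    return mean_diff / (2 * mean)

incomes = [1000, 2000, 3000, 6000]  # hypothetical family incomes
cv = coefficient_of_variation(incomes)
cc = coefficient_of_concentration(incomes)
print(round(cv, 3), round(cc, 3))
```

Because the first measure squares the deviations and the second does not, a few very large incomes move the coefficient of variation more than the coefficient of concentration, which is one reason for carrying both through an analysis.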

For one primarily interested in the particular hypotheses explored and the 
economic significance of the analysis, many useful results are presented; but 
the discussions of methods and of economic interpretations are so inter- 
mixed throughout that it is harder to read than would have been the case 
if the methodological aspects had been concentrated more heavily in the 
appendices and less in the text. 

The author was limited to data already available to him from existing 
sources. As a consequence, he had to use such data with whatever defects they
might have for his analytical purposes. The sampling carried out in 33 cities
in the Financial Survey of Urban Housing was subject to a number of 
biases. One can never be completely sure in analyzing such data exactly how 
valid his conclusions will be, but the author attempts to determine and indi- 
cate the probable nature of any biases in his results caused by the kind of 
original data he was working with, and he succeeds in making effective use 
of the available information. 

While high standards of analysis are followed for the most part, there is at 
least one major point in which a serious misinterpretation has been made, 
and a number of minor points in which the analysis may be questioned. 
The principal problem is in chapter 1 in the analysis of changes in the 
average income level of cities. The author draws the conclusion based on a 
regression analysis that cities with low income level in 1929 have a greater 
relative decline in income than cities with high income level in 1929 (see 
pages 19 and 20). He shows that the regression of 1933 average income, Y, 
on 1929 average income, X, has nearly the same slope as the line through the 
origin having a slope of the ratio of the means, but the regression of X on Y 
has a very much steeper slope with respect to the X-axis. He properly con- 
cludes that these two lines would have been identical had there been a 
perfect correlation between X and Y. However, he concludes that lack of a 
perfect correlation arises from errors in the observations and that the true 
line would lie somewhere between these two fitted lines and that cities with low
income levels in 1929 had a greater relative decline than cities with high 
income levels. I think that the analysis and the conclusion in this case are 
faulty. If the only reason that individual cities did not fall precisely on a 
straight regression line was because of errors in measuring the average 
incomes in those cities, then the analysis made would be correct. Actually,
however, if the average city incomes were known precisely for each city for 
each year, there still would be a dispersion about the line and probably 
almost as much dispersion as is observed in his data. The errors in the ob- 
servations probably contribute relatively little to the lack of correlation 
between 1929 and 1933 average incomes. If this is true, the appropriate re- 
gression to have used to compare the average income level in 1933 of cities 
having a given average income in 1929 is the regression of 1933 average 
income on 1929 average income. 
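
The three lines discussed above can be computed side by side. The sketch below uses invented city averages (not the Financial Survey figures) and a hypothetical function name. It shows only the general fact at issue: whenever the correlation is less than perfect, the regression of X on Y is steeper with respect to the X-axis than the regression of Y on X, whatever the source of the scatter.

```python
def compare_lines(xs, ys):
    """Return slope of Y on X, slope of X on Y (as dY/dX), ratio of means, and r."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b_y_on_x = sxy / sxx          # regression of Y on X
    b_x_on_y = syy / sxy          # regression of X on Y, measured against the X-axis
    r = sxy / (sxx * syy) ** 0.5  # correlation coefficient
    return b_y_on_x, b_x_on_y, my / mx, r

# Hypothetical average incomes for five cities: X = 1929, Y = 1933.
x_1929 = [1200, 1500, 1800, 2100, 2400]
y_1933 = [700, 1000, 900, 1300, 1200]

b_yx, b_xy, ratio_of_means, r = compare_lines(x_1929, y_1933)
print(b_yx, b_xy, ratio_of_means, r)
```

The two slopes differ by the factor r squared, so they coincide only under perfect correlation. Which of them is appropriate depends on the question asked, which is the substance of the criticism above.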

A few other relatively minor criticisms should be made of his remarks on 
methodology. 

On page 23, in discussing the measurement of income inequality, the rela- 
tive merits of the standard deviation and of the mean of all possible differ- 
ences between individual families are compared as measures of inequality. 
It is remarked that the standard deviation is of special utility because of its 
high sampling stability. This high sampling stability, in fact, is an advantage 
of the standard deviation in sampling from normal and certain other popu- 
lations, but the standard deviation may or may not have higher sampling 
stability than other measures of inequality when sampling from abnormal 
distributions such as the income distribution. 

In a few instances throughout the volume, tests of significance are made 
separately for each of the 33 cities, and the conclusion is stated in terms such 
as, on page 79, “the correlation is statistically significant on the 5 percent 
probability level in the three cases that show the larger coefficients of cor- 
relation.” The individual tests of significance based on the larger differences 
observed among a set of observations are improper. 
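
A rough calculation shows why picking out the largest of many results and testing each at the 5 percent level is misleading. The sketch assumes, purely for illustration, 33 independent tests in which no real effect exists; the independence assumption and the function name are not from the book.

```python
from math import comb

def prob_at_least(k, n, alpha):
    """P(at least k of n independent tests reject at level alpha when every null is true)."""
    return sum(comb(n, j) * alpha ** j * (1 - alpha) ** (n - j) for j in range(k, n + 1))

n_cities, alpha = 33, 0.05
p_one = prob_at_least(1, n_cities, alpha)    # at least one "significant" city by chance
p_three = prob_at_least(3, n_cities, alpha)  # at least three by chance
print(round(p_one, 2), round(p_three, 2))
```

Under these assumptions the chance of at least one nominally significant result is about 0.82, and the chance of three or more is about 0.23, so a handful of significant cases among 33 is weak evidence by itself.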

Again, on page 104 and the top of page 105, the statement is made that a 
proper test of discrepancies of certain observations from a line of regression 
would entail an analysis of variance for each city sample but that for tech- 
nical reasons this type of analysis was not carried out. The technical reasons 
cited were that the distributions were far from normal. Actually, the type of 
test proposed will be a valid test when dealing with large samples whether or 
not the individual observations are normally distributed, and large samples 
were being used in this instance. However, the process that was followed for 
dealing with this problem was quite adequate so that no confusion in the 
analysis or in the conclusions that were drawn resulted. 

There are other minor errors that have no significant effect on the conclu- 
sions from the analysis. One or two slips in the text may cause confusion. 
On page 160, formula (1; 2) should be σ² instead of σ. On page 163, it is
stated that “It can easily be shown that the area of the polygon AEFG is:
1/2 Σ r_i(q_i + q_{i-1})”; actually the polygon referred to should be ADGFE.

On the whole, the analysis is a careful and scholarly one and contains an 
excellent beginning in a field where more analysis and improved analytical
tools are needed.

Review by John Lintner
Assistant Professor of Finance, Harvard University 


This book is a valuable addition to the series of studies of income and
wealth by members of the staff and consultants of the National Bureau
of Economic Research. In the present volume, Dr. Mendershausen adds 
very substantially to our knowledge of fluctuations in the size distributions 
of family money incomes between prosperity and depression. The study is 
based primarily on an analysis of the valuable data on incomes of identical 
families in 1929 and 1933 supplied for 33 large and middle-sized cities by 
the Financial Survey of Urban Housing. Information from this source is 
supplemented by income tax statistics for the United States, Wisconsin and 
Delaware, and earnings data from the Social Security Board and German 
income and wage data. Dr. Mendershausen’s work is thus based on a broader 
range of data, better suited to his purpose of analyzing the changing struc- 
ture of the entire range of the income distribution, than has been used in 
any previous study. The material is handled in a careful and competent 
manner. Moreover, the whole analysis is enriched by a considerable insight 
into the economic forces which produce the observed changes in the over-all 
distribution of incomes, and the interplay of changes in different parts of 
the income structure. The important new information and analytical results 
presented in this study, as well as the suggestive discussion of their sig- 
nificance, and the problems raised for further investigation make this work 
important reading for all economists. While at many points the exposition 
is involved and difficult, readers of the study will be well repaid for their 
effort. 

Economists are concerned with changes in the distribution of income 
receipts because of their significant welfare implications, and also because 
shifts in the distribution of income may be expected to affect the volume 
of aggregate demands for consumer goods as well as the volume and
composition of new saving. Through these factors, changes in the distribution
of income will influence the level of total income payments, business activity, 
and employment. Shifts in the distribution of income influence aggregate 
demands for consumers’ goods and the volume of saving because different 
income groups tend on the average to spend differing proportions of their 
income on various goods and correspondingly to save differing proportions 
of their income. In order to refine their estimates of the total demand for 
consumers’ goods and the level of savings that will tend to be associated 
with different levels of total income payments, economists therefore need 
to know: first, how the total income will be distributed between different 
income groups; and second, how these different income groups will tend on 
the average to allocate their income between different types of saving and 
expenditures on various consumers’ goods and services. Mendershausen’s 
work is concerned exclusively with the first of these two types of information. 

His important contributions in this respect are three in number. In the 
first place, his results indicate that, contrary to previous impression, the
relative dispersion or “inequality” of the income distribution as a whole 
tends to vary inversely with the level of total income payments over the 
cycle. Incomes tend to be relatively more unequally distributed in depres- 
sion than in prosperity. The Financial Survey data, dealing with the decline 
from 1929 to 1933, show that in almost all of the cities income inequality was
greater in the latter year, and the supplementary data studied tend gen- 
erally to show a similar tendency in periods of falling incomes. These sup- 
plementary data likewise generally show a decline in the inequality of the 
entire distribution in years of rising income, although in this case the ex- 
ceptions are somewhat more frequent. Some additional support for the 
hypothesis that income inequality declines with rising incomes may be 
found in the recent sampling survey conducted by the Department of
Agriculture and the Federal Reserve Board, which indicates that for the United
States as a whole the Gini index of inequality stood at .38 in 1945 as com- 
pared with .47 in 1941 and .46 in 1935-36. However, the apparent failure 
of inequality to decline between the latter two years again casts some 
doubt on the validity of the hypothesis. 

In this connection, Mendershausen goes on to show that the general 
tendency he finds for the inequality of the entire income distribution to vary 
inversely with the level of total incomes is perfectly consistent with 
the findings of earlier studies that (a) the inequality in the distribution of 
incomes over $5,000, and (b) the share of the top 1%-5% of income
recipients in total income payments both vary directly with the level of
aggregate incomes. Indeed, Financial Survey data show both these latter
characteristics which relate, however, only to changes in the inequality of the
upper part of the income scale. As Bortkiewicz had shown, changes in the
inequality of the entire income distribution depend on the (weighted) changes 
in inequality within both the upper and lower groups of incomes as well as 
the changes in the relative mean incomes between the groups. Without 
exception, Mendershausen’s data show that the inequality within the lower 
50%-70% of income recipients was greater in 1933 than in 1929, and that 
the relative difference in mean incomes of the two groups had increased. 
These two effects were sufficiently strong in most cases to outweigh the re- 
duced inequality among the upper range of incomes. Inequality of the whole 
distributions consequently increased. This analysis is important because it 
emphasizes the danger of imputing changes in part of the distribution to 
the whole range of incomes, and because Mendershausen’s generalization, 
if later confirmed more fully, would bring the empirical results of Polak’s 
and other studies of the effects of changing inequality of incomes on con- 
sumption expenditure into agreement with the established body of economic 
analysis. 
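
A toy calculation shows how inequality of the whole distribution can rise while inequality among the upper incomes falls. All figures below are invented for illustration and are not Mendershausen's data. The Gini index is computed as the mean absolute difference over all pairs divided by twice the mean, which is an assumed definition.

```python
def gini(xs):
    """Gini index: mean absolute difference over all ordered pairs / (2 * mean)."""
    n = len(xs)
    diff_sum = sum(abs(a - b) for a in xs for b in xs)
    return diff_sum / (2 * n * sum(xs))

# Hypothetical incomes of the same six families in two years.
lower_1929, upper_1929 = [800, 1000, 1200], [4000, 6000, 10000]
lower_1933, upper_1933 = [0, 300, 900], [3000, 4000, 5000]

for label, lower, upper in (("1929", lower_1929, upper_1929),
                            ("1933", lower_1933, upper_1933)):
    print(label,
          "upper:", round(gini(upper), 3),
          "lower:", round(gini(lower), 3),
          "all:", round(gini(lower + upper), 3))
```

In this example the upper group becomes more equal, but the lower group becomes much less equal and the gap between the group means widens, so the index for all six families rises. A conclusion drawn from the upper incomes alone would have the wrong sign for the whole.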

Mendershausen’s second major contribution lies in his effort to explain 
the observed changes in the quantitative characteristics of different segments 
of the size distribution of total family incomes by establishing patterns of 
change within the upper and lower ranges of the separate distributions of
different qualitative types of income. For instance, the greater inequality 
among lower incomes during depression is traced back to (a) the expansion
of unemployment, (b) the greater importance of the income gap between 
the employed and the unemployed, (c) the unequal incidence of unemploy- 
ment among the lower and the more highly paid workers, and (d) the increas- 
ing mean difference in the wage rates of high and low-pay jobs. This analysis 
represents a significant advance in our understanding of changes in the 
lower part of the income distribution. 

Similarly, Mendershausen explains the changing distribution of the upper 
groups of income in terms of (a) the tendency for income from property to 
fluctuate cyclically more than income from work and (b) the fact that very 
high incomes usually include more income from property than income from 
work. While this explanation is probably acceptable in very general terms, 
it is marred by the fact that property incomes are made up of such hetero- 
geneous types of return as dividends, rents and royalties, interest, capital 
gains minus capital losses, and “other income.” These different types of 
property income are known to have different patterns of change over the 
cycle, and the entire explanation of the observed changes in the distribu- 
tion of large incomes would have been substantially improved if these dif- 
ferences in the behavior of different types of property income had been 
studied and introduced into the analysis. 

This effort to establish patterns of change in the distribution of different 
qualitative types of incomes represents a most constructive step. Menders- 
hausen simply doesn’t push his analysis of the behavior of different factoral 
incomes far enough, particularly in the case of property incomes. Neverthe- 
less, he presents a very suggestive and helpful analysis as far as he goes, and 
it is to be hoped that his work will stimulate further study along these lines. 
Surely this type of analysis is an indispensable supplement to the study of 
“income generating functions” suggested by Marschak in his preface. 

The third important contribution of Mendershausen’s study is to provide 
for the first time in the literature a study in quantitative terms of the changes 
in the relative position of individual families in the income scale between 
prosperity and depression. Everyone’s income does not change in the same 
proportion over the cycle: some families in each income group in a base year 
gain in relative position, and others drop behind. Data on these shifts are 
provided by the Financial Survey in the form of a joint distribution of 
sampled families by their income in both 1929 and 1933. Mendershausen 
finds that both the absolute and relative dispersion of the 1933 incomes 
of those families in the highest and lowest income groups in 1929 are greater 
than for the 1933 incomes of the central income groups of 1929. Interestingly 
enough, the diversity of the high-income families of 1933 with respect to 
their former incomes is less than the dispersion of the top-income families 
of 1929 with respect to their subsequent incomes. Perhaps the most
significant findings relate to the analysis of “favored” and “disfavored” groups,
that is, those income groups whose mean income in the second year is greater or
less than would have been expected if allowance were made only for changes 
in the level of total incomes, the changed inequality of the total distribution, 
and the shifts in rank of the distribution as a whole. On the basis of this 
standard, the recipients of very low and moderately high incomes in 1929
fared relatively better during the depression than those with moderately low
and extremely high incomes in 1929. These differences are again explained 
in terms of the relative stability of different qualitative types of income 
return. 

This analysis of the changing relative position in the income scale breaks 
new and significant ground; but more studies will be needed to test the sta- 
bility of the pattern shown for the 1929-33 experience. Moreover, the 
full value of this type of information cannot be realized until budget data are
available for the average expenditure patterns of members of given year 
income groupings cross-classified by previous years’ incomes. 


Five Week Patterns of Prices and Volume: A Classification of the Weekly Move- 
ments of Trading Velocity and the Dow Jones Industrial Averages through the 
Twenty Year Period 1926-1945. Arthur A. Merrill. Schenectady 8, N. Y.: the 
Author (1567 Kingston Ave.), 1946. Pp. ii, 32. $1.00. Paper. 


Reviewed by Owen Ely
Editor, The Analysts Journal 
61 Broadway, New York 6 


This 34-page pamphlet consists of a two-page explanatory preface and 32
pages of tables. The author has attempted to determine whether it is
possible to forecast the weekly trend of the stock market by studying (1) 
the up or down changes in the Dow Jones industrial price average, and (2) 
the changes in trading velocity (as measured by shares per trading hour), 
in each of the five preceding weeks. Only changes of trend were considered, 
without regard to the amount of change. The study covers the period 
January 1, 1926 to May 30, 1946, which includes 1,063 five-week periods 
(one beginning each week). The varying patterns of changes in price and
volume for each five-week period have been studied in relation to the trend 
of prices in the week immediately following. 

There are 32 basic patterns, representing the number of combinations 
of up and down for a five-week period. Each page is thus devoted to a single 
price pattern, which appears at the top of the page. Each price pattern may 
be combined with 32 potential volume patterns of the same character, and 
all of the latter are repeated on each page. For example, page 1, item 1, 
pictures a five-week period in which both price and volume gained steadily; 
while on page 32, item 32, both prices and volume were lower in each con- 
secutive week. In between there are 1,022 other combinations of price- 
volume patterns. Opposite each pattern there is indicated the number of 
times the market advanced in the following week and the number of times 
it declined. 
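
The counts quoted above follow from treating each of the five weekly changes as either up or down. A short enumeration, with "U" and "D" as assumed labels, reproduces them:

```python
from itertools import product

# Each five-week pattern is a sequence of five up/down changes: 2**5 = 32.
price_patterns = list(product("UD", repeat=5))
volume_patterns = list(product("UD", repeat=5))

# Every price pattern paired with every volume pattern: 32 * 32 = 1,024.
combined = [(p, v) for p in price_patterns for v in volume_patterns]

all_up = (("U",) * 5, ("U",) * 5)      # page 1, item 1 in the pamphlet's layout
all_down = (("D",) * 5, ("D",) * 5)    # page 32, item 32
in_between = [c for c in combined if c not in (all_up, all_down)]

print(len(price_patterns), len(combined), len(in_between))
```

With 1,024 cells and only 1,063 overlapping five-week periods to fill them, most cells can contain very few observations.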
Evidently this system proved somewhat disappointing, for Mr. Merrill
decided to add the results when volume patterns varied (from the one pic- 
tured) with respect to one week out of the five. Thus the number of potential 
forecasts would be nearly doubled. In processing the results for forecasting, it 
was obviously necessary to select only outstanding results where price 
changes (in the week following the five-weeks test period) showed a pre- 
ponderance of ups or downs. As to the value of the forecasting results, the 
report seems rather diffident and incomplete. The writer in the last year and 
a half found that he had “good luck” where the difference in the number of 
ups and downs was two or more, or where there was a difference of five or 
more for the “near” volume patterns. In this period there were eighteen cor- 
rect forecasts and four incorrect, so that the forecasts obtained were 83 per 
cent accurate; but since 17 per cent would involve losses, the net favorable 
result would be about 67 per cent. However, the failure to report any data 
on the actual amounts of gains and losses, with allowance for dividend in- 
come and cost of trading (including taxes), makes the conclusion mainly 
of theoretical interest. 

Moreover, applying the same rules (differences of two and five or more, 
in the actual and near results) to the entire period covered by the book, 
1,063 weeks, definite forecasts were apparently obtainable in only 191 weeks 
or 18 per cent of the time, as compared with 30 per cent in the short period 
reported by the author. No attempt was made to work out the value of the 
forecasts over the whole period as this would involve considerable study 
by this reviewer. 
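
The percentages in the two paragraphs above are simple ratios, sketched below. The assumption (not stated in the review) is that the "net favorable result" is the share of correct forecasts less the share of incorrect ones. With the counts as printed, 18 and 4, this gives roughly 82 and 64 per cent rather than the 83 and 67 quoted, so the printed figures may be rounded or may rest on slightly different counts.

```python
# Counts as printed in the review, taken at face value.
correct, incorrect = 18, 4
total = correct + incorrect

accuracy = correct / total                    # share of forecasts that proved right
net_favorable = accuracy - incorrect / total  # right share minus wrong share

# Coverage over the whole period: weeks with a definite forecast / all weeks.
coverage = 191 / 1063

print(f"accuracy      {accuracy:.1%}")
print(f"net favorable {net_favorable:.1%}")
print(f"coverage      {coverage:.1%}")
```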

This system of forecasting is purely mechanical; it appears likely that 
much more valuable results would be obtainable if the amount of change in 
price and volume, instead of merely the direction of the change, could be 
used in a forecasting formula. This would, of course, involve a tremendous 
number of computations; but the problem would be relatively simple for 
some of the wonderful new computing machines now available. Even the 
use of IBM equipment, with punched cards, would solve the mechanical 
problem (Mr. Merrill’s tabulations were prepared “with the aid of a home- 
made calculating machine”). 

The number of research projects which could be developed to aid fore- 
casting stock market trends is limited only by the number of permutations 
and combinations of published data relating to the stock market—or by the 
energy and time possessed by the students of forecasting techniques. The 
results stated in Mr. Merrill’s study are so fragmentary and incomplete 
that his little brochure has little practical interest, except that it may open 
up a new field of study for students of the market. 
Readers are invited to submit letters about statistical methodology books for
publication in this forum. Concise, informative letters which supplement 
previously published reviews by pointing out specific strengths, weaknesses, 
errors, and errata in currently used books are wanted. Criticisms based on 
actual use of a book as a text are especially desired from statistics instruc- 
tors. Other letters may consist of suggestions for the writing of books and 
reviews. Letters which contain adverse criticisms of JOURNAL reviews will 
be submitted to the author of the review for any reply he may care to make. 
Contributors are requested to avoid personalities. The right to decide whether 
a letter merits publication is reserved. Letters should be sent to the review 
editor, Oscar K. Buros, Rutgers University, New Brunswick, N. J. 





FREQUENCY ARRAYS 


More than a decade has passed 
since I first came upon a copy of 
the 48-page booklet by H. E. Soper en- 
titled Frequency Arrays: Illustrating 
the Use of Logical Symbols in the Study 
of Statistical and Other Distributions, 
which was published by the Cambridge 
University Press in 1922. What Soper 
calls “arrays” are more commonly 
known today as “generating func- 
tions,” a terminology introduced by 
Laplace, who employed them sys- 
tematically in his great work, Théorie 
Analytique Des Probabilités (1st edi- 
tion, Paris, 1812). So far as I have been 
able to determine, Frequency Arrays 
was the first elementary exposition of 
generating functions written in English 
expressly for students of mathematical 
statistics. At any rate, it was the first 
exposition of this subject that I came 
upon, and I was eager to own a copy. 
I soon found, however, that it was 
generally regarded by booksellers as 
being out-of-print, and that virtually 
none of my statistical friends in the 
United States knew of this booklet. 
During my visit to England, 1935- 
37, I continued my inquiries about 
Frequency Arrays, and learned ulti- 
mately that possibly a copy might be 
obtained from L. Reeve & Co. Ltd., 
Publishers, in Ashford, Kent. This 
proved to be the case, and prior to 
World War II, three or four of my 
statistical friends and I obtained copies 
from this source. In May of this year 
I wrote to the above publishers to in- 
quire regarding the possibility of ob- 
taining additional copies of Frequency 
Arrays, since my statistical friends 
continued to exhibit a desire to pur- 
chase copies of this booklet. In reply 
to my letter, the publishers stated that 
“as we have a fairly large stock of this 
work still left we can supply further 
copies if required.” 

Frequency Arrays was reviewed in 
this JOURNAL in 1923 (Vol. 18, pp. 
1073-1074) by John Rice Miner, whose 
review contains a résumé of Soper’s symbolic 
method of treating probability-gener- 
ating and moment-generating func- 
tions. The review concludes with the 
following paragraph: 


“This résumé will suffice to give the 
reader some idea of the methods in- 
volved. Mr. Soper then illustrates their 
use by applications to the binomial, 
Poisson, Gaussian, exponential, gamma 
type and hypergeometric frequencies, 
as well as to geometrical distributions 
and problems of random migration. 
In this, as in other fields, the symbolic 
method proves itself a powerful engine, 
but whether it will come into more gen- 
eral use is for the future to determine.” 


In spite of this favorable review, Fre- 
quency Arrays appears to have been 
fairly widely ignored in the United 
States. On the other hand, in Soper’s 
obituary, by M. Greenwood, published 
in the Journal of the Royal Statistical 
Society (Vol. 94, 1931, pp. 135-141), we 
find the following: 


“... the work which will surely keep 
his memory green when those who 
knew and loved him are forgotten is 
Frequency Arrays. The Royal Statisti- 
cal Society has at least the credit of 
proving an exception to the rule that 
the prophet is not without honour save 
in his own country. It is true that in 
other English journals, Frequency Ar- 
rays was spoken of politely. But the 
only notice in an English journal which 
I have seen conveying the impression 
that Soper had made a first-rate con- 
tribution to statistical methodology 
appeared in our Journal over the ini- 
tials L. I. (1923, LXXXVI, 67), and 
the Council in awarding the Guy 
Medal made mention of this tract. 
Continental statisticians did not con- 
fine themselves to vague politeness. 
The French Academy of Sciences 
awarded Soper a Montyon Prize, and 
the late Professor Tschuprow wrote a 
review (Nordisk Statistisk Tidskrift, 
1924, III, 414-17) in which he charac- 
terized the tract not, as some other 
reviewers hinted, as an ingenious way 
of setting out what everybody knew 
before, but as ‘einen namhaften Beitrag 
zur formal mathematischen Ausrüs- 
tung des Statistikers’.” 


Today there is another little book by 
a British author which, by virtue of 
having some exercises for the student 
to work, is probably better suited than 
Frequency Arrays to class-room in- 
struction in generating functions. I 
refer to A. C. Aitken’s Statistical 
Mathematics (New York: Interscience 
Publishers. 1st edition, 1939; fourth 
edition, 1945. $1.50), which was re- 
viewed by S. S. Wilks in this JOURNAL 
(Vol. 36, 1941, pp. 148-149). Never- 
theless, in my experience, teachers of 
mathematical statistics, and others, 
seem still to find Frequency Arrays a 
stimulating little volume, and many 
desire to own a copy. 

Therefore, I take pleasure in inform- 
ing readers of this JOURNAL generally 
that copies of Frequency Arrays can 
still be purchased from L. Reeve & Co. 
Ltd., Sankey House, Brook, Ashford, 
Kent, England; also from the Abbey 
Garden Press, 132 West Union Street, 
Pasadena, California. The English 
price is 3s. 6d., postage 2d.; the 
American price is 85 cents postpaid. 

CHURCHILL EISENHART, Associate 
Professor of Mathematics and 
Biometrician, University of Wis- 
consin; Principal Mathematician, 
National Bureau of Standards 


A NOTE ON STATISTICAL BOOKS 


IN REVIEWING statistics books it is 
customary to discuss such topics as 
technical accuracy, content, organiza- 
tion, balance, and suitability. There is 
another factor which is even more basic 
in writing or appraising a book—the 
statistical creed of the author. At pres- 
ent there are three major interest 
groups in the field of statistics, each 
with its own “frame of reference”: the 
mathematical statistician whose pri- 
mary interest is the theory of statistics, 
the operating statistician who applies 
statistical principles and techniques to 
the solution of current problems of 
government and business, and the sub- 
ject-matter specialist to whom statis- 
tics is a tool subordinate to his major 
interest. The prevailing pattern of sta- 
tistical thinking has been, and still is, 
that of the latter group, although the 
first group is becoming more and more 
influential. The second group is small, 
disorganized, and ineffectual profes- 
sionally, although it may be the nu- 
cleus of a real profession of practicing 
statisticians. 

The conflict between the first two 
groups is a conflict which exists in any 
science between theory and practice, a 
conflict which may be helpful in mak- 
ing theory more vital and practice 
more efficient. The real issue in statis- 
tics, however, is between the first two 
groups combined and the third group, 
since this is a conflict between a grow- 
ing theoretical and applied science, on 
the one hand, and an older and out- 
moded statistical creed, on the other. 
The basic issues between these two 
groups may be summarized as follows: 
The first group thinks of statistics as a 
theoretical and applied science in its 
own right. The second group does not. 
The first group believes applied statis- 
tics should be a full-time profession, 
that statistical functions should be 
assigned to the statistician. The second 
group does not. The first group sees 
the statistician as an expert in statisti- 
cal inference and estimation based 
upon sampling, and in statistical in- 
quiry generally. The second group be- 
lieves that the subject-matter special- 
ist is the only person competent to 
make estimates and inferences about 
his subject matter. 

The compartmentalization of sta- 
tistics into economics, sociology, psy- 
chology, education, biology, business, 
etc., has set up barriers to the full utili- 
zation of efficient statistical principles 
and practices. If the biometricians and 
agriculturists have done better than 
the others, it is simply because they 
were fortunate in having an R. A. 
Fisher working in these fields. This 
compartmentalism has led to a number 
of strange notions such as (1) Fisher’s 
work is limited to small samples, 
(2) analysis of variance and the design 
of inquiry are applicable only to agri- 
culture and biology, and (3) random 
sampling cannot be used in the social 
science fields. Fisher himself has con- 
tributed to this belief in compartmen- 
talism for the tables by Fisher and 
Yates are called Statistical Tables for 
Biological, Agricultural, and Medical 
Research, although few of the 34 tables 
are limited in their applications to the 
three fields mentioned in the title. 
Compartmentalism has done more. It 
has led to statistical confusion by in- 
troducing 57 varieties of statistical 
symbols, by keeping alive erroneous 
theory and questionable practice, and 
by consistently refusing to recognize 
that statistics is a science composed of 
principles rather than a tool like typing 
and shorthand that one can pick up in 
a few months. To show how slow our 
subject-matter specialists are in keep- 
ing up with statistics, the 1946 edition 
of Encyclopedia Americana in its lists 
of books under the topics of Statistical 
Method and Statistics carries no ref- 
erence to R. A. Fisher, Statistical 
Methods for Research Workers (first 
edition 1925), although it contains the 
names of several books of decidedly 
lesser importance. 

Compartmentalism seems to owe its 
existence and development to the 
graduate schools because their special- 
ists discovered that statistics had a 
role to play in academic research. Sta- 
tistics books were written in each field 
to meet this academic need. As time 
went on, it was assumed that these 
books were valid sources of training for 
all kinds of statistical work. The oper- 
ating and research statistician in gov- 
ernment and business knows how 
absurd this assumption is. The 6-credit- 
hour “statistician,” even though offi- 
cially recognized by the U. S. Civil 
Service Commission in 1939, is inade- 
quately trained both in statistical 
theory and in statistical practice. What 
we need is a new approach to statistics 
based, not upon the patterns of aca- 
demic research, but upon the needs of 
government and industry. 

Nor is there any greater likelihood 
that the results will be better in the 
future since both statistical theory and 
practice are growing at a lively pace. 
As a matter of fact many of the prob- 
lems now facing government and busi- 
ness and industry require an expert in 
statistical, especially sampling, design, 
not an economist or sociologist who 
has had a course or two in statistics. 
Indeed there is a very important field 
of professional statistical practice 
which seems to be neglected, for which 
training in mathematical statistics is a 
necessary, but not a sufficient, condi- 
tion for qualification. In other words 
neither knowledge of a subject nor 
knowledge of mathematical statistics, 
by itself, is enough. 

With regard to the problem of sta- 
tistics books I should like to make the 
following suggestions: 

1. The first course in statistics 
should deal with general principles and 
should cover the fields of sampling, 
estimation, and inference in their more 
elementary aspects. This course should 
be taken by all persons regardless of 
subject-matter specialization. It is 
analogous to the first course in college 
chemistry or physics; it gives the 
proper orientation but it is hardly 
more than an introduction to the sub- 
ject. 

2. Advanced courses in these prin- 
ciples, as well as their applications to 
special fields, should follow this course 
on principles. In this way a person 
will not become so easily the victim of 
rules-of-thumb which are often ineffi- 
cient, but will learn to think in terms of 
principles applicable to problem situa- 
tions. Books should be prepared along 
these two lines of theory and applica- 
tions. 

3. It is absurd to think that one can 
be a statistician without an under- 
standing of mathematics at least 
through the calculus. Mathematics 
should be a prerequisite to statistical 
work. Prepare special nontechnical 
books for the laymen and others inter- 
ested in the field. We should not try to 
meet all statistical needs in one type of 
book. 

4. Have more books prepared by the 
second group—those on the firing line 
in statistics. For a generation we have 
had stereotyped textbooks prepared by 
academic subject-matter specialists; 
more recently the mathematical statis- 
ticians have moved into the market. 
The former group is weak on theory, 
the latter group is weak on practice. 
What we need are books that show how 
to apply principles to the solutions of 
operating problems. “Applied” statis- 
tics to date has been largely limited to 
academic research. Books by Fry, 
Shewhart, and Simon are examples of 
the type of book we need. 


5. Write books to inform, rather 
than to impress, the reader. Most 
books are simply a hodge-podge of 
topics. We need an integration of sta- 
tistics to give it meaning. Where 
mathematical derivations are given, 
they are sometimes so involved that 
the logical assumptions on which the 
derivations are based are hardly men- 
tioned, let alone described. Stating as- 
sumptions in mathematical language 
does not seem to be enough. More at- 
tention needs to be given to statistical 
principles and their applicability to 
real problems. Then too there is a de- 
cided lack of balance in some books. 
For example, one book gives about 200 
pages to index numbers and periodicity 
analysis, while another gives about 50 
pages to rank order correlation, but 
neither devotes anything to stratified 
random sampling as such, a much more 
fundamental problem. 

A. C. ROSANDER, Statistical Divi- 
sion, Bureau of Internal Rev- 
enue, Washington 25, D.C. 


JASA REVIEW POLICY 


I OFFER some suggestions to be taken 
into account in any reformulation 
of the preliminary statement of Jour- 
NAL review policy appearing in the 
August 1945 issue of the ASA Bulletin, 
especially the expressed objectives: 


a) To provide readers with a schol- 
arly appraisal of the book for their own 
reading guidance. 

b) To stimulate progress toward 
higher professional standards among 
statisticians by praising good statisti- 
cal writing and scholarship and by cen- 
suring poor statistical writing and 
scholarship. 


In view of the increasing tendency to 
specialize, it would appear to be a sig- 
nificant and valid function of the 
JOURNAL review to promote a catholic- 
ity of interest on the part of specialists 
in other branches of the science, es- 
pecially through emphasis on the wider 
applicability of concepts and meth- 
odology characteristic of one or an- 
other specialized field. It would also 
seem reasonable to expect the review 
to give guidance to authors as well as 
to readers, through commentary and 
generalization which need not imply 
praise or censure of the work under 
consideration. Attention might be di- 
rected, not only to the individual au- 
thor’s errors great and small, but also 
to the areas still to be explored, to the 
syntheses still to be made, to perspec- 
tives which should become more gen- 
eral, to the inevitable anachronisms 
introduced by the mixture of new and 
classical viewpoints in the treatment 
of different phases of statistics, to the 
existence of important contributions in 
periodicals or in foreign-language lit- 
erature which are usually overlooked, 
etc. Furthermore, there is still room 
for the “creative” review, which goes 
on to make an original contribution 
inspired by some aspect of the work 
under scrutiny. 

The traditional kind of review may 
be of little value to the nonstatistical or 
semistatistical reader who has a lim- 
ited technical background but is nev- 
ertheless interested in, or working in, 
some phase of the science. Such read- 
ers, according to the June 1946 Jour- 
NAL, would seem to compose the large 
majority of Association members— 
perhaps, a still larger majority if the 
National Roster definition of a statisti- 
cian (ASA Bulletin, May 1945) were 
applied. Furthermore, many statisti- 
cian members, trained before the 
Fisher-Neyman revolution, have an 
unsure grasp of the newer statistical 
tools and follow contemporary litera- 
ture with difficulty. The elevation of 
professional standards depends to a 
large extent on the successful develop- 
ment of a sound but relatively non- 
technical literature to meet the needs 
of the nonstatisticians, the semistatis- 
cians, and the quasi-statisticians in the 
Association. Special attention should 
be given to the review of such litera- 
ture, which is not specifically men- 
tioned in the August 1945 list of four 
types of publications to be reviewed in 
the JOURNAL. 

All members, ranging from the non- 
statistician to the specialist, could 
benefit from “review articles”—essays 
which summarize in systematic, in- 
tegrated form the accomplishments in 
a branch of statistics to date, or which 
present more or less authoritative 
statements on fundamental notions of 
the science, general methods, etc. Such 
essays should be written by invitation, 
probably by more than one person or 
with the guidance of a committee of 
broad perspective. 

Finally, it would seem desirable 
to construe the objectives of reader 
guidance and standards elevation to 
include, as a constructive aspect, 
the formulation of “model” outlines 
for textbooks and, perhaps, other for- 
mal works on statistical topics. Such 
outlines should take account of the 
observations made on similar works by 
JOURNAL reviewers. Their preparation 
and periodic revision should, perhaps, 
be assigned to a standing committee of 
the Association, one which might also 
be concerned with standards for the 
training of future members of the sta- 
tistical profession. 


IRVING H. SIEGEL, Chief, Eco- 
nomics Division, Veterans Ad- 
ministration, Washington, D. C. 
A collection of estimates of national product, gross and net, totals and by final use com- 
ponents. 


256 pages (6” x 9”) 74 tables $3.00 


Output and Productivity in the Electric and 
Gas Utilities, 1899-1942 
By Jacob M. Gould 


A study of changes in output, and in the relation of output to use of labor, capital, materials 
and fuel, in the electric, manufactured gas and natural gas industries. 


208 pages (6” x 9”) 42 tables 22 charts $3.00 


Measuring Business Cycles 
By Arthur F. Burns and Wesley C. Mitchell 


A book for laymen and economists with a general interest in business cycles; for students of 
business cycles; and for statisticians interested in time-series analysis, measurement of eco- 
nomic magnitudes, and testing hypotheses regarding time series. 


592 pages (8” x 12”) 199 tables 77 charts $5.00 


Economic Research and the Development of 
Economic Science and Public Policy 
(Volume 3 in the Twenty-fifth Anniversary Series) 
Twelve papers presented at the Twenty-fifth Anniversary Meeting of the National Bureau of 
Economic Research, New York, June 6 and 7, 1946 by distinguished investigators from Eng- 
land, France, Holland, Sweden and the United States. 


208 pages (6” x 9”) $1.00 


National Bureau of Economic Research 
1819 Broadway New York 23, N.Y. 
McGraw-Hill Books of Timely Importance 





SAMPLING INSPECTION 
By the Statistical Research Group, Columbia University. In press—ready in 
January. 


A systematic account of certain of the best current inspection practices, together with 
tables and detailed instructions for carrying out these practices. 


SELECTED TECHNIQUES OF STATISTICAL ANALYSIS 


For Scientific and Industrial Research and Production and Management 
Engineering 
By the Statistical Research Group, Columbia University. In press—ready in 
January. 
Deals with a series of problems which occur frequently in planning, analyzing, or 
interpreting quantitative data. Various techniques appropriate to these problems are 
explained in terms of both general principles and specific procedures. 


STATISTICAL QUALITY CONTROL 
By EUGENE L. GRANT, Stanford University. McGraw-Hill Industrial Organization 
and Management Series. 475 pages, $5.00 


Presents a working manual which explains simple but powerful statistical techniques 
that can be widely used in industry to reduce costs, improve quality of products, and 
obtain a better coordination between design, production, and inspection. The book 
explains the Shewhart control chart and its use. 


ELEMENTARY STATISTICS AND APPLICATIONS 
By JAMES G. SMITH and ACHESON J. DUNCAN, Princeton University. Vol. I of 
Fundamentals of the Theory of Statistics. 720 pages, 5% x 8%. $4.50 


Designed to provide text material for the first course in general statistics, this volume 
presents methods of summarization and comparison, frequency distribution analysis, 
the normal curve, simple linear and nonlinear correlation, multiple and partial cor- 
relation, time series analysis, and forecasting. 


SAMPLING STATISTICS AND APPLICATIONS 


By JAMES G. SMITH and ACHESON J. DUNCAN. Vol. II of Fundamentals of the 
Theory of Statistics. 491 pages, 5% x 8%. $4.25 


Intended for advanced students or research workers. After reviewing basic concepts 
and definitions, the authors discuss the general theory of frequency curves and the 
theory of random sampling. Important sampling distributions are derived and their 
applications to a variety of problems are illustrated. 


Send for copies on approval 


McGRAW-HILL BOOK COMPANY, INC. 
330 West 42nd Street New York 18, N.Y. 
A New Book in Statistics 


MATHEMATICAL METHODS 
OF STATISTICS 


BY HARALD CRAMÉR 


While British and American statisticians have been develop- 
ing the science of statistical inference, French and Russian 
probabilitists have transformed the classical calculus of prob- 
ability into a rigorous and pure mathematical theory. Cramér 
has joined these two lines of development in a masterly expo- 
sition of the mathematical methods of modern statistics. 


For anyone with a working knowledge of undergraduate 
mathematics, the book is self-contained, since the first part is 
an introduction to the fundamental concept of a distribution 
and of integration with respect to a distribution. 





The second part contains the general theory of random 
variables and probability distributions, while the third is de- 
voted to the theory of sampling distributions, statistical estima- 
tion, and tests of significance. 


570 pages $6.00 


MATHEMATICAL STATISTICS 


BY S. S. WILKS 


“An excellent introduction to the more recent developments 
of the mathematical theory of statistics.”—Journal of the Ameri- 
can Statistical Association. 





284 pages Planographed $3.75 


At your bookstore 


PRINCETON UNIVERSITY PRESS 
PRINCETON, N.J. 
ECONOMETRICA 


Journal of the Econometric Society 


Contents of Vol. 14, No. 3, July, 1946, include: 


PAUL A. SAMUELSON: Lord Keynes and the General Theory 
RUTLEDGE VINING: The Region as a Concept in Business-Cycle Analysis .. 201 
GEORGE R. DAVIES: Pricing and Price Levels ........................... 219 
THOMAS C. SCHELLING: Raise Profits by Raising Wages? ................. 227 
ERIK RUIST: Standard Errors of the Tilling Coefficients Used in Confluence 
    Analysis 
STUART A. RICE: The United Nations Statistical Commission ............ 242 
ACHESON J. DUNCAN: “Free Money” of Large Manufacturing Corporations 
    and the Rate of Interest ......................................... 251 
AVRAM KISSELGOFF: “Free Money” of Large Manufacturing Corporations 
    and the Rate of Interest: A Reply 

Published Quarterly Subscription: $7.00 per year 


The Econometric Society is an international society for the advancement of eco- 
nomic theory in its relation to statistics and mathematics. 
Subscriptions to Econometrica and inquiries about the work of the Society and the 
procedure in applying for membership should be addressed to Alfred Cowles, Secretary 
and Treasurer, The Econometric Society, The University of Chicago, Chicago 37, 
Illinois. 











JOURNAL OF FARM ECONOMICS 
Published by THE AMERICAN FARM ECONOMIC ASSOCIATION 
Editor: Warren C. Waite 
University of Minnesota, University Farm, St. Paul 1, Minnesota 
Volume XXVIII, November 1946, Number 4 

Some of the major articles are: 
Notes on Developments in Agricultural Policy and Program in the United 
    [remainder of title illegible] ...................... John D. Black 
Government Egg Programs during Wartime .................... Gerson Levin 
[title illegible] .................................... Willard D. Arant 
The Tobacco Program: Exception or Portent ........... Charles M. Hardin 
Can Prices Allocate Resources in American Agriculture 
    ................................... J. M. Brewster and H. L. Parsons 


This Journal, a quarterly, contains in addition, notes, reviews of books and 
articles, and a list of recent publications and is published in February, May, August, 
November by the American Farm Economic Association. Yearly subscription $5.00. 


Secretary-Treasurer: ASHER HOBSON 


Department of Agricultural Economics 
University of Wisconsin, Madison, Wisconsin 
Nationally Used 


Two leading texts which provide 


a complete training in accounting 


Paton’s 
ESSENTIALS OF ACCOUNTING 


“Worthy to be set beside anything which has yet been 
written in this field."—The Accountant (London) 


“Of invaluable aid to an embryo accountant.”—The Wharton Review 


“The book is explicit and reads easily. It emphasizes the 
necessity of sound thinking on accounting matters. The 
author has inculcated his ideas and extensive teaching 
experiences into this volume, unmistakably a valuable 
contribution to accounting literature."—Harvard Business Review $5.00 


ADVANCED ACCOUNTING 


“There is perhaps, no other accountant who equals Pro- 
fessor Paton in keenness of insight, careful analysis and 
penetrating interpretation.”—Journal of Accountancy 


“This book is characteristic of Professor Paton. It is at 
once a monument to his indefatigable energy and the 
testament of his accounting and his moral philosophy.”—American Economic Review $5.00 


Problems and Practice Sets are available for 
use with these books 
—— The Macmillan Company —— 
Applied General Statistics 


By Frederick E. Croxton, Columbia University 
Dudley J. Cowden, University of North Carolina 





This clear, complete and easily teachable text presents a 
more comprehensive treatment of general statistics than 
any other text now in the field. Compressing within its 
944 pages all of the material necessary for a course in 
elementary statistics, it includes such subjects as: sta- 
tistical reliability and significance; fitting of normal 
curves, binomials and skewed curves; analysis of variance, 
and linear, non-linear and multiple correlation. Adopted 
by hundreds of leading colleges, Applied General Sta- 
tistics has received wide praise for the many teaching 
aids offered—diagrams, charts and complete tables. An 
outstanding feature is the discussion of the main body 
of material on an elementary level. 


944 pages a7" College List, $4.25 


Practical Business Statistics 


By F. E. Croxton, and D. J. Cowden 


A unique and authoritative text on how statistics func- 
tion in business. All of the examples are carefully drawn 
from the records of large organizations, and represent 
all of the many jobs that statistics perform. It does not 
require a mathematical background beyond first year 
algebra. Profusely illustrated. 
529 pages av’ College List $3.75 






Send for your approval copies 


PRENTICE-HALL, INC. 70 FIFTH AVENUE, NEW YORK 11 
Morris M. Blair 
ELEMENTARY STATISTICS 


This text in elementary statistics treats statistical theory clearly and 
concisely, and through careful use of non-technical language the 
exposition is unusually readable and interesting. The author, who 
has had many years of experience with beginners in this field, has 
planned his book particularly for students of business administration, 
although its applications cover the general field as well. 


Exceptionally valuable to both student and teacher alike are the 
excellent tables, worksheets, graphs, and diagrams. 





1944 690 pages $3.50 


Helen M. Walker 


ELEMENTARY STATISTICAL METHODS 


Highlighting the many outstanding features of this text are its 
clarity of exposition, its simplified use of the latest developments in 
the mathematical theory of statistics, the adaptability of the book 
to various types of courses, and its many valuable exercises and 
problems. The author’s fresh, direct approach to involved concepts 
assures the soundest possible basis for understanding statistical theory 
and practice. 


1943 368 pages $2.75 


MATHEMATICS ESSENTIAL FOR 
ELEMENTARY STATISTICS 


Professor Walker's self-teaching text provides an excellent review 
of all the mathematics needed in the study of elementary statistics. 


246 pages $1.90 


HENRY HOLT AND COMPANY 


257 Fourth Avenue, New York 10 
Available Now... 


A Completely Revised Edition of 


The Money Value 
Of A Man 


By LOUIS I. DUBLIN, Ph.D., Second Vice 
President and Statistician; and ALFRED J. 
LOTKA, D.Sc., Assistant Statistician, Metropoli- 
tan Life Insurance Company 


THIS BOOK has long been accepted as a standard guide for deter- 
mining the money values of persons at various ages according 
to their earnings. In this newly revised edition extensive recomputa- 
tions have been made in the tables to conform to altered conditions, 
resulting largely from lowered interest rates and increased expectancy 
of life. Also, the structure of the final tables showing the money 
value of a man by age and income has been remodeled, with definite 
advantage to the user. 

The book will be of specific value to statisticians and economists ; 
to health officers and social workers interested in the costliness of 
disease and premature death; to judges, lawyers and compensation 
boards for fair awards for personal injury and incapacitation ; and to 
insurance agents to determine the insurance prospects should carry. 


OUTLINE OF CONTENTS: 








Historical Retrospect 
The American Family 
Cost of Bringing Up A Child 
Income in Relation to Age and Economic Status 
The Money Value of a Man as a Wage-Earner 
The Burden of the Handicapped 
Valuation of Indemnity for Personal Injury or Death 
Disease and the Depreciation of the Money Value of a Man 
Application to Public Health 
Application to Life Insurance 
Social Insurance in Relation to the Money Value of a Man 
Age Schedules of Family Consumption Units and Savings 
Effect, Upon Money Values, of Changes in Basic Data 


57 Tables 8 Charts Price $6.00 
THE RONALD PRESS COMPANY 
15 East 26th Street New York 10, N.Y. 
Students in Psychology and Education


by ALLEN L. EDWARDS 
360 pp., illus. Published June, 1946 $3.50


Designed to be understood by the nonmathematically-trained student, this 
new text is a modern approach to statistical theory. Direct, concise, and 
simply written; for students in psychology, education, social science, and 
biometrics. 


A Simplified Guide to Statistics
For Psychology and Education
By G. MILTON SMITH 
109 pp. Published October, 1946 $1.25 


The second edition of this useful manual provides a working knowledge of 
the statistical tools and concepts employed in psychology and education. 
This thoroughly revised edition includes new material and a number of exer- 
cises, and is distinguished for its clarity and teachability. 


Elementary Accounting 


By ARNOLD W. JOHNSON 
842 pp. Published September, 1946 $5.00 


This thoroughly revised and enlarged edition of Principles of Accounting 
offers an unusually complete coverage of the first year's work. It employs 
modern procedures, is developed in a logical manner, and contains a wealth 
of illustrative material. Six practice sets plus a set of 145 working papers 
are available separately. 


RINEHART & COMPANY, Inc. 


232 Madison Avenue, New York 16
THE SOCIAL FRAMEWORK
OF THE AMERICAN 
ECONOMY 


An Introduction to Economics 
By J. R. HICKS 


Stanley Jevons Professor of Political Economy in the 
University of Manchester 


and
ALBERT G. HART 


Visiting Professor of Economics, Columbia University 


USING the work of economic statisticians as a basis, this
brilliant new book presents an important innovation in
the study of elementary economics. Some time ago Pro- 
fessor Hicks perceived that the measurement of national income— 
or, as he calls it, Social Accounting—would serve as an excellent 
integration of the theoretical and descriptive aspects of economics, 
and thus as an introduction to the entire field. An outgrowth of 
this observation was The Social Framework, which appeared in 
England in 1942. Professor Hart has now prepared an edition for 
American readers, replacing English statistics and examples with 
corresponding American material. Now in its fourth printing, it 
has been received enthusiastically by professors, research economists 
and laymen alike. 


Simon Kuznets, Bureau of Economic Research, says of it: 


‘THE SOCIAL FRAMEWORK OF THE AMERICAN ECON- 
OMY impresses me as a highly useful, lucid exposition of the 
basic features of an advanced economy such as that of the United 
States. Its lively style should make it palatable to undergraduate 
students; its use of statistical data is proper, since national income 
is the modern version of the Tableau Economique and provides 
the best exposition of the leading ideas of economic study.’ 


280 pages $2.75 
OXFORD UNIVERSITY PRESS 
114 Fifth Avenue New York 11, N.Y. 
MATHEMATICAL TABLES


prepared by the 





Mathematical Tables Project 
National Bureau of Standards 


TABLE OF SPHERICAL BESSEL FUNCTIONS, Vol. I $7.50
TABLE OF CIRCULAR AND HYPERBOLIC TANGENTS AND 
COTANGENTS FOR RADIAN ARGUMENTS $5.00 
TABLES OF LAGRANGIAN INTERPOLATION COEFFI- 
CIENTS $5.00 
TABLES OF RECIPROCALS OF THE INTEGERS FROM 
100,000 THROUGH 200,009 $4.00 
TABLE OF THE BESSEL FUNCTIONS J0(z) AND J1(z) FOR
COMPLEX ARGUMENTS $7.50 
TABLES OF ASSOCIATED LEGENDRE FUNCTIONS $5.00 
TABLES OF FRACTIONAL POWERS $7.50 
TABLE OF ARC SIN X $3.50 


Forthcoming Volumes 


TABLE OF SPHERICAL BESSEL FUNCTIONS, Vol. II


TABLE OF BESSEL FUNCTIONS Y0(z) AND Y1(z) FOR COMPLEX
ARGUMENTS 


TABLE OF BESSEL FUNCTIONS OF FRACTIONAL ORDERS 
TABLE RELATED TO THE MATHIEU DIFFERENTIAL EQUATION 


COLUMBIA UNIVERSITY PRESS · New York 27
The Annals of Mathematical Statistics
(Founded by H. C. Carver) 


THE OFFICIAL JOURNAL OF THE INSTITUTE 
OF MATHEMATICAL STATISTICS 





CONTENTS 
PAGE
Sample Criteria for Testing Equality of Means, Equality of Variances,
and Equality of Covariances in a Normal Multivariate Distribution.
S. S. WILKS ........ 257
Contributions to the Theory of Sequential Analysis II, III. M. A.
GIRSHICK ........ 282
Sufficient Statistical Estimation Functions for the Parameters of the
Distribution of Maximum Values. BRADFORD F. KIMBALL ........ 299


On Functions of Sequences of Independent Chance Vectors with Applications
to the Problem of the “Random Walk” in k Dimensions.
D. BLACKWELL AND M. A. GIRSHICK ........ 310
Approximation of the Distribution of the Product of Beta Variables by
a Single Beta Variable. JOHN W. TUKEY AND S. S. WILKS ........ 318
Some Fundamental Curves for the Solution of Sampling Problems.
EDWARD C. MOLINA ........ 325
Enlargement Methods for Computing the Inverse Matrix. LOUIS
GUTTMAN ........ 336


The Frequency Distribution of Deviates from Means and Regression
Lines in Samples from a Multivariate Normal Population. D. J.
FINNEY ........ 344
On the Asymptotic Distributions of Certain Statistics used in Testing
the Independence between Successive Observations from a Normal
Population. P. L. HSU ........ 350
Notes: 
Estimating the Parameters of a Rectangular Distribution. A. GEORGE
CARLTON ........ 355
On the Power Function of the Sign Test for Slippage of Means. JOHN
E. WALSH ........ 358


An Approximation to the Probability Integral. J. D. WILLIAMS .... 363


Distribution of the Ratio of Sample Range to Sample Standard Devia- 
tion for Normal and Combinations of Normal Distributions. G. A. 
Vol. XVII, No. 3—September, 1946


Subscription Rate: $5.00 per annum 


Address orders for subscriptions and back numbers to Professor Paul S.
Dwyer, Secretary, Institute of Mathematical Statistics, University of 
Michigan, Ann Arbor, Michigan 
Fine craftsmanship and modern production methods produce the Fully
Automatic FRIDEN ... the ultimate in Calculators. It is easy for you
to become a master in Figure Work, using the many exclusive features
embodied in this calculating device. Speak with your local Friden
Representative, learn why deliveries must be anticipated ... then
order Today!


Friden Mechanical and Instructional Service is available in
approximately 250 Company Controlled Sales Agencies throughout
the United States and Canada.


FRIDEN CALCULATING MACHINE CO., INC. 


HOME OFFICE AND PLANT · SAN LEANDRO, CALIFORNIA, U.S.A. · SALES AND SERVICE THROUGHOUT THE WORLD
THE MATHEMATICIAN’S
CALCULATOR 


Today’s 
Highest 
Calculator 
Performance 


Marchant fills a unique place in statistical and mathematical
computing. Here are a few reasons:

• True-figure dials for all three factors give you dial-figure
  proof of all entered factors and the answer.

• Complete capacity carry-over in all carriage dials.

• Multiplies positively or negatively during entry of multiplier
  at 1,300 counts per minute; keeps pace with the setting of
  the multiplier and yields the answer not more than half a
  second after the last figure is entered.

• Automatic optional two-way carriage shift.

• Selective automatic tabulation useful in multiplication as well
  as division.

Send for #846 Index of FREE pamphlets on Marchant Methods
for Statistical Mathematics. No obligation.


MARCHANT
SILENT SPEED ELECTRIC CALCULATORS


THIRTY-SIXTH YEAR 


Marchant Calculating Machine Company · Home Office: Oakland 8, California, U.S.A.